Related Work In contrast, Cilk-NOW does provide support for parallel programs and it does provide adaptive parallelism and fault tolerance, but it does so only for the http://www.usenix.org/publications/library/proceedings/ana97/full_papers/blumofe
Extractions: Cilk-NOW is unique in delivering adaptive and reliable execution for parallel programs on networks of workstations. Traditionally, systems such as PVM [ ], TreadMarks [ ], and others [ ] that are designed to support parallel programs on networks of workstations have not provided adaptive parallelism or fault tolerance. On the other hand, most systems that do provide support for adaptive execution or fault tolerance take a "process-centric" approach. That is, they provide an abstraction of mobile processes and/or an abstraction of reliable processes. As such, these systems are very general in their potential application, but they do not provide much support for parallel programs. In contrast, Cilk-NOW does provide support for parallel programs and it does provide adaptive parallelism and fault tolerance, but it does so only for the Cilk parallel programming model. Such specificity allows the Cilk-NOW design to take an end-to-end approach [ ] that leverages properties of the Cilk programming model in order to implement adaptive parallelism and fault tolerance simply and efficiently.
Efficient Detection Of Determinacy Races In Cilk Programs Efficient detection of determinacy races in Cilk programs. Keith H. Randall, Andrew F. Stark, Detecting data races in Cilk programs that use locks, http://portal.acm.org/citation.cfm?id=258493
The Implementation Of The Cilk-5 Multithreaded Language Many Cilk programs run on one processor with virtually no degradation compared to equivalent C programs. This paper describes how the work-first principle http://portal.acm.org/citation.cfm?id=277725
SourceForge.net Cilk-support by the execution of a cilk program *after* the program has completed. that has three important properties 1) a cilk program admits a `C elision , http://sourceforge.net/mailarchive/forum.php?forum_id=6784&max_rows=25&style=nes
SuperTech Papers Efficient Detection of Determinacy Races in Cilk Programs, by Mingdong Feng and Charles E. Leiserson. To appear in 9th Annual ACM Symposium on Parallel http://theory.lcs.mit.edu/~supertech/papers.html
Research Abstracts 2004 A Cilk program with work T1 and critical-path length expected time on P processors using Efficient detection of determinacy races in Cilk programs. http://www.csail.mit.edu/research/abstracts/abstracts04/html/38/38.html
Extractions: On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs What: A key capability needed by some data-race detectors for fork-join multithreaded programming models, such as MIT's Cilk [5, 2] system, is the ability to determine whether two threads are logically in series or in parallel. We provide two algorithms to maintain series-parallel (SP) relationships "on the fly," one serial and one parallel. The serial SP-order algorithm runs in O(1) amortized time per operation. In contrast, the previously best algorithm requires a time per operation that is proportional to Tarjan's functional inverse of Ackermann's function. Thus, our new SP-maintenance algorithm immediately yields an improved determinacy-race detector. In particular, any fork-join program running in T1 time on a single processor can be checked on the fly for determinacy races in O(T1) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks. By combining our SP-order algorithm with Feng and Leiserson's serial SP-bags algorithm [4], we obtain a parallel SP-maintenance algorithm, called
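The series-or-parallel question in the abstract above can be illustrated with a small static sketch. It assumes the usual description of SP-order in terms of two total orders over the threads of an SP parse tree, an "English" order (always left then right) and a "Hebrew" order (right then left at P-nodes): two threads are in series when both orders agree on which comes first, and in parallel when the orders disagree. The names Leaf, Node, english, hebrew, and sp_relation are hypothetical, and the sketch recomputes both orders from a finished tree, so it does not reproduce the on-the-fly O(1) amortized maintenance the abstract claims.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    name: str  # one thread of the computation


@dataclass
class Node:
    kind: str  # "S" for series composition, "P" for parallel composition
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def english(t):
    """Left-to-right order of threads at every internal node."""
    if isinstance(t, Leaf):
        return [t.name]
    return english(t.left) + english(t.right)


def hebrew(t):
    """Like english(), but P-nodes visit the right subtree first."""
    if isinstance(t, Leaf):
        return [t.name]
    if t.kind == "P":
        return hebrew(t.right) + hebrew(t.left)
    return hebrew(t.left) + hebrew(t.right)


def sp_relation(tree, x, y):
    """Return "series" if the two orders agree on x and y, else "parallel"."""
    e = {name: i for i, name in enumerate(english(tree))}
    h = {name: i for i, name in enumerate(hebrew(tree))}
    same_direction = (e[x] < e[y]) == (h[x] < h[y])
    return "series" if same_direction else "parallel"


# Thread a runs first, then b and c run in parallel, then d runs last.
tree = Node("S", Leaf("a"), Node("S", Node("P", Leaf("b"), Leaf("c")), Leaf("d")))
print(sp_relation(tree, "b", "c"))
print(sp_relation(tree, "a", "d"))
```

A determinacy-race detector would ask this question for each pair of conflicting accesses to the same location. The cost of answering it is what the abstract's bounds are about.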
Analysis Of Multithreaded Programs the regions of memory accessed by procedures in multithreaded Cilk programs. In conjunction with a pointer analysis for multithreaded Cilk programs, http://www.cag.lcs.mit.edu/~rinard/paper/analysisOfMultithreadedPrograms.html
Extractions: Surveys research in the analysis of multithreaded programs. Identifies two classes of multithreaded programs, parallel computing programs and activity management programs , and discusses analyses that are appropriate for each class. Also discusses issues associated with weak memory consistency models and multithreading. Pointer and Escape Analysis for Multithreaded Programs Presents a combined pointer and escape analysis for Java programs with unstructured multithreading. The algorithm analyzes interactions between threads to obtain precise points-to and escape information for objects accessible to multiple threads. The results are used for synchronization elimination and elimination of dynamic checks associated with the use of region-based memory allocation. Symbolic Bounds Analysis of Pointers, Array Indices, and Accessed Memory Regions
CSAIL Research Abstract Detecting Data Races in Cilk Programs that Use Locks. In Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA 98), http://publications.csail.mit.edu/abstracts/abstracts05/tck/tck.html
Extractions: Theory Introduction This project deals with the implementation of a provably good data-race detector, called the Nondeterminator-3, which runs efficiently in parallel. A data race occurs in a multithreaded program when two logically parallel threads access the same location while holding no common locks and at least one of the accesses is a write. The Nondeterminator-3 checks for data races "on the fly" in programs coded in Cilk, a shared-memory multithreaded programming language. The Nondeterminator-3 succeeds the serially running Nondeterminator-2. Although data-race detectors are only debugging tools, there are many reasons for building an efficient parallel data-race detector. Firstly, a parallel data-race detector would enable faster debugging of races in parallel programs. An on-the-fly data-race detector that preserves the parallelism of an application program would enable the program to be run with race detection options always turned on; a serially running data-race detector, on the other hand, would not enable such real-time testing. The Nondeterminator-3 implementation also helps in demonstrating the possibility of using the underlying theoretical ideas to obtain a data-race detector that achieves good speed-up in practice.
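The data-race definition in the abstract above translates almost directly into a predicate over pairs of memory accesses. The following is a minimal sketch of that definition only. The Access record and the logically_parallel callback are hypothetical names introduced for illustration, and nothing here reflects how the Nondeterminator tools are implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    thread: str            # identifier of the accessing thread
    location: str          # memory location touched
    is_write: bool         # True for a write, False for a read
    locks_held: frozenset  # locks held at the time of the access


def is_data_race(a, b, logically_parallel):
    """Apply the definition: parallel threads, same location,
    no common lock, and at least one write."""
    if not logically_parallel(a.thread, b.thread):
        return False
    if a.location != b.location:
        return False
    if not (a.is_write or b.is_write):
        return False
    return not (a.locks_held & b.locks_held)


# Assumed example: threads t1 and t2 are logically parallel.
parallel_pairs = {frozenset({"t1", "t2"})}


def logically_parallel(x, y):
    return frozenset({x, y}) in parallel_pairs


w1 = Access("t1", "x", True, frozenset({"A"}))
r2 = Access("t2", "x", False, frozenset({"B"}))
r2_locked = Access("t2", "x", False, frozenset({"A", "B"}))
print(is_data_race(w1, r2, logically_parallel))         # no lock in common
print(is_data_race(w1, r2_locked, logically_parallel))  # both hold lock A
```

Checking every pair of accesses this way is quadratic. The sketch only spells out the condition being tested, not an efficient way to test it.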
Schedule (Note That The Papers Appear In Your Packets Sorted (10) 2:30 Detecting Data Races in Cilk Programs that Use Locks. G.I. Cheng, C.E. Leiserson, K. Randall, A. Stark, and M. Feng. 2:50 Discussion 3:20 Break http://bradley.csail.mit.edu/~bradley/ymp-workshop/schedule.text
Extractions: Schedule (Note that the papers appear in your packets sorted alphabetically by the first author.) The format of the sessions will be a twenty-minute talk including only one or two minutes of questions. At the end of each session we will have a half hour for discussion.

Monday June 8:
Session 1: Teaching
(1) 8:30 Teaching Parallel Algorithms Using the Cilk Multithreaded Programming Language. C.E. Leiserson. (Only 10 minutes for discussion)
Session 2: Implementation I
(2) 9:00 The Implementation of the Cilk-5 Multithreaded Language. K. Randall.
(3) 9:20 Thread Scheduling for Multiprogrammed Multiprocessors. R.D. Blumofe.
(4) 9:40 A Nonblocking Cilk Implementation. P. Lisiecki.
10:00 Discussion
10:30 Break
Session 3: Performance
(5) 11:00 Predicting Price/Performance Trade-offs for Whitney: A Commodity Computing Cluster. J.C. Becker, B. Nitzberg, R.F. Van Der Wijngaart, M. Yarrow, C. Kuszmaul.
(6) 11:20 A Comparison of Multithreading Implementations. S.R. Taylor.
(7) 11:40 Exploitation of Multithreading to Improve Program Performance. W.E. Cohen, N. Yalamanchilli, R. Tewari, C. Patel, and T. Kazi.
12:00 Discussion
12:30-2:00 Lunch (provided at the Peabody)
Session 4: Applications
(8) 2:00 Construction of a Multithreaded Chess Program. D. Dailey.
(9) 2:20 Nops: A Conservative Parallel Simulation Engine for TeD. Poplawski, and Nicol.
(10) 2:40 Using Multithreading for the Automatic Load Balancing of 2D Adaptive Finite Element Meshes. G. Heber, R. Biswas, P. Thulasiraman, and G.R. Gao.
(11) 3:00 Applying Multi-threaded Programming to the Simulation of Virus Shell Self-Assembly. R. Schwartz, and B. Berger.
3:20 Discussion
3:50 Break
Session 5: Memory
(12) 4:20 Computation-Centric Memory Models. M. Frigo.
(13) 4:40 Space Efficient Execution of Deterministic Parallel Programs. D.J. Simpson, and F.W. Burton.
5:00 Discussion
6:00-10:00 Dinner (provided at the Peabody)

Tuesday June 9:
Session 6: Implementation II
(1) 8:30 Communications-Efficient Multithreading on Limited-Bandwidth Networks. M.S. Bernstein, and B.C. Kuszmaul.
(2) 8:50 Indolent Closure Creation. V. Strumpen.
(3) 9:10 Scheduling Adaptively Parallel Jobs. B. Song.
9:30 Discussion
10:00 Break
Session 7: Multithreaded Programming Systems
(4) 10:30 Space-Efficient Scheduling of Nested Parallelism. Narlikar.
(5) 10:50 A Compiler for pH: An Implicitly Parallel, Multithreaded Haskell. A. Caro, and J.W. Maessen.
(6) 11:10 Speculative Processing Using Active Objects. C.-K. Yuen, and M. Feng.
(7) 11:30 Athapascan-1: Parallel Programming with Asynchronous Tasks. G.G.H. Cavalheiro, F. Galilee, and J.-L. Roch.
(8) 11:50 Stampede - A Programming System for Emerging Scalable Interactive Multimedia Applications. R.S. Nikhil, U. Ramachandran, J. Rehg, R.H. Halstead, C.F. Joerg, and L. Kontothanassis.
12:10 Discussion
12:40-2:10 Lunch
Session 8: Debugging
(9) 2:10 Efficient Detection of Determinacy Races in Cilk Programs. M. Feng, and C.E. Leiserson.
(10) 2:30 Detecting Data Races in Cilk Programs that Use Locks. G.-I. Cheng, C.E. Leiserson, K. Randall, A. Stark, and M. Feng.
2:50 Discussion
3:20 Break
Session 9: Compilers
(11) 3:50 Design of a Restructuring Compiler for Contemporary Languages and Architectures. L. Harrison.
(12) 4:10 A Compiler Technique for Speculative Execution for Alternative Program Paths Targeting Multithreaded Architectures. A. Unger, T. Ungerer, and E. Zehendner.
(13) 4:30 Compiling for Multithreaded Architectures. X. Tang.
4:50 Discussion
5:20 EOW (End of Workshop)
Bradley C. Kuszmaul research investigates how to run Cilk programs efficiently on the Internet. I want to build a chess program that runs on 100000 processors on the web. http://bradley.csail.mit.edu/~bradley/
Extractions: I am a Research Scientist in the Supercomputing Technologies Group at the MIT Laboratory for Computer Science. Akamai Technologies, and before that I was an assistant professor in the Yale University Department of Computer Science with a joint appointment in the Yale University Department of Electrical Engineering. (Note: Many of the following links are not yet working, as I have not transferred them all from Yale yet.) My research applies algorithm design to solve systems problems in high-performance computing. I was one of the principal architects of the Connection Machine CM-5, and am the co-author of two world-class computer chess programs (StarTech and *Socrates). I participated at MIT in the Cilk development project, which provides an algorithmic multithreaded programming system. As an assistant professor at Yale I worked on the Ultrascalar Project, in which we improved the theoretical bounds for how fast a superscalar processor's clock can run, as a function of the window size or the issue width. We also had an 8-issue out-of-order processor fabricated in a 0.18 micron copper/low-K VLSI process. Recently we have been working on developing the mechanisms for a speculative dataflow processor. I co-taught Theory of Parallel Hardware in Spring 2004.
Extractions: Theory of Parallel Systems, Fall 2003 Any number of development tools can be used to read the .asm files in this section. These files contain assembly source code. Any number of development tools can also be used to compile and run the .c files in this section.
Extractions: 1st IEEE Computer Society International Workshop on Cluster Computing, p. 43. Evaluation of the Performance of Multithreaded Cilk Runtime System on SMP Clusters. Liang Peng, National University of Singapore; Mingdong Feng, National University of Singapore; Chung-Kwong Yuen, National University of Singapore. DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IWCC.1999.810808 Index Terms: cluster computing, multithreading, performance evaluation. Citation: Liang Peng, Mingdong Feng, Chung-Kwong Yuen. "Evaluation of the Performance of Multithreaded Cilk Runtime System on SMP Clusters," iwcc, vol. 00, no. , p. 43, 1st 1999.
Testing Efficient Detection of Determinacy Races in Cilk Programs. 9th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 97, June 22-25, 1997, http://user.it.uu.se/~hessel/testing/
Extractions: hessel paupet bengt (Subject to change). Thursday Introduction and Basic concepts. Wednesday Test case generation. Thursday Data Flow Analysis. Thursday Object-Oriented Testing. Monday Joint venture with VnV at Scientific Computing. Thursday Test Oracles. Monday Seminar by Yves Ledru. Thursday Regression Testing. Thursday FME02 papers. Thursday Timing. Thursday Testing timed automata. Wednesday Distributed systems / Race Conditions. Thursday Model generation. Friday Testing of configurable systems, Mats Grindal (ENEA). Thursday Software testing in a small company environment, Carl Ericksson. Articles should have been read before the meeting; the articles are listed in decreasing importance.
Extractions: This paper is cited in the following contexts: Parallel and Fully Recursive - Multifrontal Sparse Cholesky (Correct) ....matrix to frontal matrix, factor the front, apply updates, copy columns to L, and free columns from front return front; Figure 3: Simplified Cilk code for the multifrontal Cholesky factorization with inlets to manage memory and synchronize extend add operations. overview, and for the technical details. The third source of overhead is the memory system. Cilk uses a shared memory model. Most parallel computers, even those with hardware support for shared memory programming, have memories that are distributed to some extent. In some cases the main memory is shared but ....
[computer-go] Hardware-Instruction. If you write a Cilk program and run it as a 1 processor program, it will run nearly as fast as the same program written without Cilk, usually within 1 or http://computer-go.org/pipermail/computer-go/2004-November/001480.html
Extractions: Fri Nov 5 16:35:28 PST 2004 A little more about Cilk, since Vincent grossly misrepresented it. Cilk is really C with a few simple constructs added: spawn, sync, inlet and abort. An inlet can catch the result of a computation and can abort work inside it. This mechanism makes it perfect for alpha-beta searching. Any function that begins with the keyword "cilk" can be spawned in parallel. The scheduler is based on work stealing. In Cilk you don't think about the number of processors; this is abstracted away. Any time a "sync" keyword is invoked, the processor doesn't idle; it randomly grabs work from somewhere else. The beauty of Cilk is that it rarely incurs overhead for these parallel constructs. The Cilk pre-processor constructs 2 versions of each "cilk" function, one that is purely serial and knows nothing
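The work-stealing idea in the message above (a processor with nothing to do picks a random victim and takes work from it) can be illustrated with a small round-based simulation. This is only a sketch under stated assumptions. It runs sequentially in plain Python, the function name run_work_stealing and its parameters are hypothetical, and it models neither Cilk's runtime nor its spawn, sync, inlet, or abort constructs. Each simulated worker owns a double-ended queue, works on its newest task, and loses its oldest task to thieves.

```python
import random
from collections import deque


def run_work_stealing(n, workers=4, seed=0):
    """Sum 0..n-1 by recursive range splitting, scheduled by a
    simulated random-victim work-stealing policy.
    Returns (total, number_of_successful_steals)."""
    rng = random.Random(seed)
    deques = [deque() for _ in range(workers)]
    deques[0].append((0, n))  # the root task starts on worker 0
    total = 0
    steals = 0
    while any(deques):  # loop until every deque is empty
        for w in range(workers):
            if deques[w]:
                lo, hi = deques[w].pop()  # owner takes its newest task
                if hi - lo <= 1:
                    total += sum(range(lo, hi))  # leaf task: do the work
                else:
                    mid = (lo + hi) // 2
                    deques[w].append((mid, hi))  # left behind for thieves
                    deques[w].append((lo, mid))  # owner continues here next
            else:
                victim = rng.randrange(workers)  # idle: pick a random victim
                if victim != w and deques[victim]:
                    # the thief takes the victim's oldest task
                    deques[w].append(deques[victim].popleft())
                    steals += 1
    return total, steals


total, steals = run_work_stealing(100)
print(total, steals)
```

With workers=1 the loop degenerates to an ordinary depth-first serial execution with no steals, which loosely mirrors the message's point that a one-processor run behaves like the plain serial program.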
[computer-go] Supercomputers And Search If you write a Cilk program and run it as a 1 processor program, it will run nearly as fast as the same program written without Cilk, usually within http://computer-go.org/pipermail/computer-go/2004-November/001501.html
Extractions: Sat Nov 6 15:10:18 PST 2004 Yes, I know computer chess has advanced. I gave up lazy eval before I left computer chess because it wasn't safe unless I used big margins, at which point it didn't help much at all. I predicted before anyone that the quality of evaluation functions would be the future of computer chess more than speed. This was when others were touting speed. I proved this to myself via simulations: better evaluations improved more rapidly with deeper searches. I think Berliner pretty much demonstrated this too. But why is MTD no longer used? Is it because MTD is so hard on the hash tables and memory latency is a bigger issue? MTD is much more dependent on the fact that the search tree remains mostly stored in memory via the hash tables, so I can believe this is possible. Nullmove nowadays gets used at R=3, usually not near the leaves.
References: Detecting data races in Cilk programs that use locks. In Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms http://dsl.cs.technion.ac.il/projects/multirace/references.htm
Extractions: References: [1] S. V. Adve and M. D. Hill. Weak ordering - a new definition. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 2-14, May 1990. [2] S. V. Adve and M. D. Hill. A unified formalization of four shared-memory models. Technical report, University of Wisconsin, Sept. 1992. [3] S. V. Adve, M. D. Hill, B. P. Miller, and R. H. B. Netzer. Detecting data races on weak memory systems. In Proceedings of the 18th Annual International Symposium on Computer Architecture (ISCA91), pages 234-243, May 1991. [4] D. Bailey, J. Barton, T. Lasinski, and H. Simon. The NAS parallel benchmark. Technical report, NASA Ames, Aug. 1991. [5] V. Balasundaram and K. Kennedy. Compile-time detection of race conditions in a parallel program. In Proceedings of the 3rd International Conference on Supercomputing, pages 175-185, June 1989. [6] T. Brecht and H. Sandhu. The Region Trap Library: Handling traps on application-defined regions of memory. In USENIX Annual Technical Conference, Monterey CA, June 1999. [7] G. Cheng, M.
Bib Efficient detection of determinacy races in Cilk programs, by Mingdong Feng and Charles E. Leiserson. In Proceedings of the Ninth Annual ACM Symposium on http://web.yl.is.s.u-tokyo.ac.jp/meeting/bib.html
Extractions: PS file is available. 1997-10-23 Executing Multithreaded Programs Efficiently. By Robert D. Blumofe. Ph.D. Thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. Cilk: An Efficient Multithreaded Runtime System. By Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. The Journal of Parallel and Distributed Computing, 37(1), pages 55-69, August, 1996. A shorter version of this paper appeared in Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 207-216, Santa Barbara, California, July 19-21, 1995. Scheduling Multithreaded Computations by Work Stealing. By Robert D. Blumofe, and Charles E. Leiserson. Proceedings of the 35th Annual Symposium on Foundations of Computer Science (FOCS), pages 356-368, Santa Fe, New Mexico, November 20-22, Space-Efficient Scheduling of Multithreaded Computations. By Robert D. Blumofe, and Charles E. Leiserson.
Cilk-users Info Page About cilk-users. This is an open list for general discussion about the Cilk programming language. However, to post you must be a member of the list. http://lists.sourceforge.net/mailman/listinfo/cilk-users
Extractions: However, to post you must be a member of the list. To see the collection of prior postings to the list, visit the cilk-users Archives Using cilk-users To post a message to all the list members, send email to cilk-users@lists.sourceforge.net You can subscribe to the list, or change your existing subscription, in the sections below. Subscribing to cilk-users Subscribe to cilk-users by filling out the following form. You will be sent email requesting confirmation, to prevent others from gratuitously subscribing you. This is a hidden list, which means that the members list is available only to the list administrator. Your email address: You must enter a privacy password. This provides only mild security, but should prevent others from messing with your subscription. Do not use a valuable password as it will occasionally be emailed back to you in cleartext. Once a month, your password will be emailed to you as a reminder. Pick a password: Reenter password to confirm: Would you like to receive list mail batched in a daily digest?
Archives Of The Caml Mailing List > Message From Xavier Leroy Last year, a team at École Normale Supérieure ranked second with a program written in OCaml, with the first prize going to a Cilk program (Cilk is C + http://caml.inria.fr/pub/ml-archives/caml-list/1999/08/66b9e5b2979fe6f478a54b2c4