Cloud Computing
CS 683
[[Announcements]]\n[[Syllabus]]\nPapersToDiscuss\n
Class location: EB 204\nTue 13:40-14:40\nThu 15:40-17:40\nOffice hours: Thu 14:40-15:40 at EA 223\nInstructor: [[Dr. Murat Demirbaş | index.html]]\n\n!! Topics\nThis class covers several systems topics in cloud computing, including \n* concepts and motivation,\n* virtualization technologies,\n* architectures,\n* networking,\n* storage and filesystems,\n* programming models,\n* application development.\n\nThe class will be a mixture of lectures and paper reviews (see PaperReviewing). We will read papers from the last couple of years describing the cloud computing technologies employed at Google, Yahoo!, Amazon, and Facebook. \n\nThe class will also include a project component. Project teams of 2 or 3 students will pursue projects of their own choosing (in consultation with the instructor). Thanks to a generous $5000 grant from Amazon, free credits are available for students to deploy hands-on experiments and services on the AWS cloud. \n\n!! Grading (Tentative)\n60% Project\n40% PaperReviewing and class participation\n
* Welcome to CS 683. Make sure to read the syllabus.\n* Some useful paper summaries: http://muratbuffalo.blogspot.com/
[[Syllabus]] PapersToDiscuss [[Course Material]] [[Announcements]]\n[[Links]]
This semester we will discuss several papers. Each student will serve as a presenter for 2 papers (tentatively), as a note-taker for the same 2 papers, and as a participant for the remaining papers. I will evaluate each student in each role throughout the semester, and assign the S/U grade based on these performances and on attendance. \n\nEach week we will review 2 research papers, allotting 1 hour to each paper. We will make heavy use of our Piazza course site. Here are the rules:\n\n* __By the morning (6am) of the day before the class__, each participant should have contributed 1 or at most 2 questions about the paper to our Piazza course site under the relevant entry. A participant should ask a question that has not already been asked by another participant. The question should have some substance and depth; otherwise the participant will not receive credit for it.\n\n* The presenter will use 30 minutes to discuss the heart (the most important and useful part) of the paper. The presenter is allowed to use up to 10 slides. Slides should be in PDF format and use large fonts. Avoid cramming text into the slides.\n\n* In the last 30 minutes, the presenter and the class will answer the questions collected on the Piazza course site. The presenter should have a very good understanding of the paper, and should have read the relevant work mentioned in the paper (if needed) to be able to defend the paper. Although we will all chime in for discussion on some questions, the presenter should be competent enough to answer most of the questions.\n\n* After the class, the presenter is responsible for writing a review of the paper based on her notes and the discussion in the class. The review should follow these FormattingGuidelines.
!! 2 paragraphs for executive summary\n* what is the paper trying to do?\n* what is the potential contribution of the paper?\n* summary of strengths and weaknesses\n\n!! several paragraphs of details (listed in order of importance)\n* technical flaws?\n* structure of paper?\n* are key ideas brought out?\n* motivation and justification of approach -- why are these ideas important?\n* presentation? (e.g., undefined terms, unclear sections…)\n* comparison with relevant work? \n\n!! questions and issues raised in the class \n* Include answers to the most interesting questions raised in the class\n* What other issues are raised?\n* How could you improve the paper?\n* Potential follow-up work, future work?\n\n(The format is borrowed from Ousterhout's advice for paper reviewing.) \n\n!!! Finish your review within 2 days of the class, and post it on the course site.
[[Syllabus (pdf)|CS683/syllabus.pdf]]\n[[Hints for Computer System Design|CS683/1Hints.pdf]]\n[[Notes on CAP theorem|CS683/cap.pdf]]
!! Week 1\n* [[ Hints for Computer System Design, ACM OS'83 | http://research.microsoft.com/en-us/um/people/blampson/33-hints/Acrobat.pdf ]] :systems (Murat)\n* [[ Lessons from Giant-Scale Services, 2001 | http://www.cs.berkeley.edu/~brewer/papers/GiantScale-IEEE.pdf ]] :architecture (Murat)\n* [[ Above the Clouds: A Berkeley View of Cloud Computing, 2009 | http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf]]\n:concept (Murat Tuncer)\n\n!! Week 2\n* Project suggestions (Murat)\n* [[ The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines | http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006]] (you can exclude chapters 5 and 6)\n:architecture (Ata)\n* [[ The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM | http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf]]\n:architecture (Ismail)\n\n!! Week 3\n* [[ Big Data: Principles and best practices of scalable realtime data systems|http://www.manning.com/marz/BD_meap_ch01.pdf]] (Chapter 1)\n:architecture (Satiye)\n* __PROJECT PROPOSAL DISCUSSIONS__\n[[Internet entrepreneurship (Jeff Bezos)| http://www.youtube.com/watch?v=j_YdhnPH24E]]\nhttp://aws.amazon.com/documentation/\n\n\n!! Week 4\n* [[ Xen and the Art of Virtualization | http://dl.acm.org/citation.cfm?id=945462]]\n:virtualization (Gunduz)\n* [[ VL2: A Scalable and Flexible Data Center Network, 2009 | http://research.microsoft.com/pubs/80693/vl2-sigcomm09-final.pdf]] \n:networking (Mustafa Battal)\n* [[ Data Center TCP (DCTCP), 2010 | http://dl.acm.org/citation.cfm?id=1851192]]\n:networking (Berk)\n\n!! Week 5\n* CAP theorem: consistency, availability, partition-tolerance. Pick two.\n:CAP (Murat)\n* [[The NoSQL Ecosystem |http://www.aosabook.org/en/nosql.html ]]\n:NoSQL (Dogan)\n* [[ Life beyond Distributed Transactions: an Apostate's Opinion, 2007 | http://nosqlsummer.org/paper/life-beyond-distributed-transactions]]\n:NoSQL (Aytug)\n* [[ Design and Evaluation of a Continuous Consistency Model for Replicated Services, 2000|http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.7743&rep=rep1&type=pdf]]\n:NoSQL [optional]\n* http://www.julianbrowne.com/article/viewer/brewers-cap-theorem\n:cap [optional]\n\n!! Week 6 \n* [[ Scalable Distributed Data Structures, 2000 | http://dl.acm.org/citation.cfm?id=1251251]]\n:NoSQL (Bugra)\n* [[ Dynamo: Amazon's highly available key-value store, 2007 | http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf]]\n:NoSQL (Can)\n* [[ Cassandra: a decentralized structured storage system, 2010 | http://dl.acm.org/citation.cfm?id=1773922]]\n:NoSQL (Tuncer)\n* [[ Eventually consistent, 2009 | http://dl.acm.org/citation.cfm?id=1435432]]\n:NoSQL [optional]\n* http://www.cs.berkeley.edu/~pbailis/projects/pbs/\n:cap [optional]\n\n!! Week 7\n* Paxos lectures \n:Consensus (Murat)\n* [[ The Chubby Lock Service for Loosely-Coupled Distributed Systems, 2006 | http://research.google.com/archive/chubby-osdi06.pdf]] \n:Consensus (Yakup)\n* [[ Chain replication for supporting high throughput and availability, 2004 | http://dl.acm.org/citation.cfm?id=1251261]]\n:Consensus (Cem)\n\n!! Week 8\n* [[ Optimistic Replication, 2005 | http://www.ysaito.com/survey.pdf]]\n:Storage (Kerem)\n* [[ The Google File System, 2003 | http://dl.acm.org/citation.cfm?id=945450 ]]\n:Storage (Emir)\n* [[ Bigtable: A Distributed Storage System for Structured Data, 2008| http://dl.acm.org/citation.cfm?id=1365816 ]]\n:Storage (Orhun)\n* Availability in Globally Distributed Storage Systems\n:Storage [optional]\n\n!! Week 9\n* __PROJECT PROGRESS PRESENTATIONS__\n\n!! Week 10\n* [[ MapReduce: simplified data processing on large clusters, 2008| http://dl.acm.org/citation.cfm?id=1327492 ]]\n:Systems (Mehmet Ali)\n* [[ Building a high-level dataflow system on top of Map-Reduce: the Pig experience, 2009 | http://dl.acm.org/citation.cfm?id=1687568 ]]\n:Systems (Onur)\n* [[ Pregel: a system for large-scale graph processing, 2010| http://dl.acm.org/citation.cfm?id=1807184 ]]\n:Systems (Reha)\n* CIEL: a universal execution engine for distributed data-flow computing, 2011\n:Systems [optional]\n\n!! Week 11\n* [[ A Comparison of Approaches to Large-Scale Data Analysis, 2009| http://dl.acm.org/citation.cfm?id=1559865 ]]\n:CloudDB (Nagehan)\n* [[HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads, 2009 | http://db.cs.yale.edu/hadoopdb/hadoopdb.pdf]]\n:CloudDB (Cagri)\n* [[ BOOM Analytics: exploring data-centric, declarative programming for the cloud, 2010 | http://dl.acm.org/citation.cfm?id=1755913.1755937 ]]\n:Systems (Sermetcan)\n* Efficient Processing of Data Warehousing Queries in a Split Execution Environment, 2011\n:CloudDB [optional]\n\n!! Week 12 \n* Maestro: A datacenter computing framework with automated locking, 2010\n:Systems (Murat)\n* [[ The Hadoop Distributed File System, 2010| http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5496972&tag=1 ]]\n:filesystem (Kemal)\n* [[ Ceph: A scalable, high-performance distributed file system, 2006| http://dl.acm.org/citation.cfm?id=1298485 ]]\n:filesystem (Saltuk Bugra)\n* Survey of Technologies for Wide Area Distributed Storage \n:filesystem [optional]\n\n!! Week 13\n\n* [[ PNUTS: Yahoo!'s Hosted Data Serving Platform, 2008| http://dl.acm.org/citation.cfm?id=1454167 ]]\n:WAN (Mahmut)\n* [[ Don't Settle for Eventual: Stronger Consistency for Wide-Area Storage with COPS, 2011 |http://www.cs.princeton.edu/~mfreed/docs/cops-sosp11.pdf]]\n:WAN (Burak)\n* [[ Transactional storage for geo-replicated systems, 2011| http://research.microsoft.com/en-us/people/aguilera/walter-sosp2011.pdf ]]\n:WAN (???)\n\n!! Week 14 \n* __PROJECT FINAL PRESENTATIONS__\n* [[ On designing and deploying Internet scale services | http://www.mvdirona.com/jrh/talksAndPapers/JamesRH_Lisa.pdf]]\n:systems (optional)\n* [[ Designs, lessons and advice from building large distributed systems| http://www.odbms.org/download/dean-keynote-ladis2009.pdf ]]\n:systems (optional)