The first summer education /research project jointly hosted by the Departments of Computer Science at Zhiyuan College and Cornell University was held July 2-30, 2014 on the Cornell campus. Twenty-seven rising seniors majoring in computer science participated. They attended two courses in computer science taught by professors in the department. In addition, 10 professors gave lectures in a variety of areas of interest to the students. Professors interacted closely with the students and were able to give them guidance in doing scientific research.

## Summer 2014 Program:

**Date:** 02/07/2014 -31/07/2014

**Number of Participants:** 27

### Courses:

- “Programming Languages” by David Gries and Michael Clarkson
- “How to Do Research” by John Hopcroft

Extracurricular activities: Tour to Niagara Falls, ice cream socials, picnics at Cayuga Lake, pizza parties at the homes of Professor Gries and Professor Hopcroft. Some students traveled to NYC on their own.

Accommodations: Alice Cook House (undergrad dorm on campus)

This one month summer program is open to juniors majoring in computer science. International airfare, housing, academic arrangements and a travel tour are covered for each participant by funds from Zhiyuan College. Students are responsible for their own meals.

**Academic Reports:**

**2:00pm, July 2: Majority is not Enough: Bitcoin Mining is Vulnerable by EMİN GÜN SİRER**

- The Bitcoin cryptocurrency records its transactions in a public log called the blockchain. Its security rests critically on the distributed protocol that maintains the blockchain, run by participants called miners. Conventional wisdom asserts that the protocol is incentive-compatible and secure against colluding minority groups, i.e., it incentivizes miners to follow the protocol as prescribed.
- We show that the Bitcoin protocol is not incentive-compatible. We present an attack with which colluding miners obtain a revenue larger than their fair share. This attack can have significant consequences for Bitcoin: Rational miners will prefer to join the selfish miners, and the colluding group will increase in size until it becomes a majority. At this point, the Bitcoin system ceases to be a decentralized currency.
- Selfish mining is feasible for any group size of colluding miners. We propose a practical modification to the Bitcoin protocol that protects against selfish mining pools that command less than 1/4 of the resources. This threshold is lower than the wrongly assumed 1/2 bound, but better than the current reality where a group of any size can compromise the system.

**9:00am, July 3: NetKAT: Semantic Foundations for Networks by Dexter Kozen**

- Recent years have seen growing interest in high-level languages for programming networks. But the design of these languages has been largely ad hoc, driven more by the needs of applications and the capabilities of network hardware than by foundational principles. The lack of a semantic foundation has left language designers with little guidance in determining how to incorporate new features, and programmers without a means to reason precisely about their code.
- This paper presents NetKAT, a new network programming lan- guage that is based on a solid mathematical foundation and comes equipped with a sound and complete equational theory. We describe the design of NetKAT, including primitives for filtering, modifying, and transmitting packets; union and sequential composition oper- ators; and a Kleene star operator that iterates programs. We show that NetKAT is an instance of a canonical and well-studied mathe- matical structure called a Kleene algebra with tests (KAT) and prove that its equational theory is sound and complete with respect to its denotational semantics. Finally, we present practical applications of the equational theory including syntactic techniques for checking reachability, proving non-interference properties that ensure isola- tion between programs, and establishing the correctness of compi- lation algorithms.

**4:00pm, July 10: The Role of Type Theory in the Use of Proof Assistants for Programming, Software Engineering, and Computing Theory by Bob Constable**

- An interesting trend in computer science research in Europe and the United States is the increasing use of proof assistants. These are computing systems that help people solve mathematical problems, including writing formal proofs that programs meet their specifications.
- These proof assistants are also being used by mathematicians and theoretical computer scientists to formally prove difficult theorems.
- The most widely used proof assistants are Coq and HOL. The Coq proof assistant is closely related to the Nuprl proof assistant built at Cornell and supported here. All of these proof assistants are based on type theory as their mathematical language, rather than set theory.
- This lecture will explain elements of type theory and show how it is used to specify programs and prove that programs meet these specifications. We will also illustrate how type theory is used in mathematics.

**2:00pm, July 15 - 17 Ken Birman**

- He introduced smart power grid. Smart power grid combines system and machine learning together. Ken aims at building a system for the smart grid so that it can work as an iPad. We can add sensors everywhere to monitor the global state of the grid. Then software for better arranging the power can be added to the system like apps to the iPad.
- Some puzzles in the system world such as two generals and muddy children.
- Introduction to the smart meter at home. He talked about what applications these meters can have and how they can improve our life.

**4:00pm, July 15 Doug James**

Normally, the sound in the cartoon are wrong. To produce the sound, people normally just record the sound that they think the true sound is like to be. So Doug James focuses on how to generate the sound better. He presented a lot of projects he has done, like how to generate the sound of water, fire and how to speed them up, how to generate the sound of broken glass which involves plenty of objects, how to model the sound of clothing and so on.

**11:00am, July 23 Thorsten**

Say we have two ranking function for some search engine. When we do experiments to test which function is better, we usually randomly divide people into 2 groups and use some features like #of clicks on the top 1 item to judge. However, this method has two problems. 1. Thorsten’s group has done some experiments and showed that none of the previous metrics truly shows whether a rank function is better than the other. 2. The previous method is silly because it’s just like divide people into 2 groups and ask one group how much they like Coca Cola and ask the other group how much they like Pepsi. What we should do is to ask each person which they like more, Coca Cola or Pepsi. Therefore, they interleaved the results generated by the two functions and showed the combined results to the users. Then they measure something like how many clicks each function receives to compare them. Their experiments show that this method truly shows which function is better. Moreover, based on this interleaving result method, they derived an algorithm to quickly find the best rank function by pairwise comparison only.

**11:00am, July 25 Kavita Bala**

She works on graphics and introduced 4 main projects. The first two are in rendering. Previously, people need to develop a new model whenever they encounter a new material. Her group developed a method that can systematically handle most kinds of clothing including velvet, silk, wool and so on. It first uses CT scan to obtain the 3d model of the material, then it adjusts some parameters to let the material look like the true material. Finally, it obtains the final rendering result. Since the rendering involves too many details, it’s very slow. So they further proposed a new method to speed it up. The last two projects is on building a database for materials. We now have large datasets for images but not materials. To enhance research on material, they developed the first large-scale database for materials that can extracts the surface of a wood floor, for example, and can obtain its color information, reflectance information, etc. In the fourth project, they build a database which provides pairwise comparison between the true color of the material (disregard of the shading).

**2:00pm, July 28 Jon Kleinberg**

In social network, people often share images, share blogs and share everything. Some information is widely spread while others are not. When we closely observe the graph of how the message is shared, we can find some structure in the graph that leads to this difference. Therefore, he design a method to predict how well some information will be spread (cascade) given the “share graph” so far. Secondly, we often use the metric - how many friends two people share - to measure how closely related the two people are. However, it’s not often the case. When two people are coworkers, even if they are not good friends, they share a lot of coworkers as friends. So he introduced another metric to measure the closeness of two people. And it showed good performance in predicting a spouse.

**11:00am, July 29 Charles F. Van Loan**

We have done much study on the matrix. However, nowadays matrix is not enough and people often have to deal with tensors (A tensor is just a multidimensional array and a matrix is a 2-dim tensor). He introduced how people can analyze a tensor and how symmetry in multi-dimension can lead to interesting properties when we unfold a 4-d tensor into a 2-d matrix.

*The above reports are edited by Xuan Luo, Kuan Yang, Zhipeng Chen*

**My Cornell Research Journey (2011 ACM Computer Science Class) **

**Colorful Life:**