Category Archives: Computer

Toward practical quantum computers

Quantum computers are largely hypothetical devices that could perform some calculations much more rapidly than conventional computers can. Instead of the bits of classical computation, which can represent 0 or 1, quantum computers consist of quantum bits, or qubits, which can, in some sense, represent 0 and 1 simultaneously.
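The "0 and 1 simultaneously" can be made concrete with amplitudes: a qubit's state is a pair of complex numbers whose squared magnitudes give the probabilities of measuring 0 or 1. The following sketch is purely illustrative (it is not drawn from the research described here) and shows a definite state next to an equal superposition:

```python
import math

# A qubit state is a pair of amplitudes (a, b) for the outcomes 0 and 1,
# normalized so that |a|^2 + |b|^2 = 1.
zero = (1.0, 0.0)                            # classical-like: definitely 0
plus = (1 / math.sqrt(2), 1 / math.sqrt(2))  # equal superposition: "0 and 1"

def measure_probabilities(state):
    """Probability of reading 0 or 1 when the qubit is measured."""
    a, b = state
    return abs(a) ** 2, abs(b) ** 2

print(measure_probabilities(zero))  # -> (1.0, 0.0)
print(measure_probabilities(plus))  # approximately (0.5, 0.5)
```

The superposition only pays off during computation; a measurement still yields a single classical bit, with the probabilities above.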

Although quantum systems with as many as 12 qubits have been demonstrated in the lab, building quantum computers complex enough to perform useful computations will require miniaturizing qubit technology, much the way the miniaturization of transistors enabled modern computers.

Trapped ions are probably the most widely studied qubit technology, but they’ve historically required a large and complex hardware apparatus. In today’s Nature Nanotechnology, researchers from MIT and MIT Lincoln Laboratory report an important step toward practical quantum computers, with a paper describing a prototype chip that can trap ions in an electric field and, with built-in optics, direct laser light toward each of them.

“If you look at the traditional assembly, it’s a barrel that has a vacuum inside it, and inside that is this cage that’s trapping the ions. Then there’s basically an entire laboratory of external optics that are guiding the laser beams to the assembly of ions,” says Rajeev Ram, an MIT professor of electrical engineering and one of the senior authors on the paper. “Our vision is to take that external laboratory and miniaturize much of it onto a chip.”

Caged in

The Quantum Information and Integrated Nanosystems group at Lincoln Laboratory was one of several research groups already working to develop simpler, smaller ion traps known as surface traps. A standard ion trap looks like a tiny cage, whose bars are electrodes that produce an electric field. Ions line up in the center of the cage, parallel to the bars. A surface trap, by contrast, is a chip with electrodes embedded in its surface. The ions hover 50 micrometers above the electrodes.

Cage traps are intrinsically limited in size, but surface traps could, in principle, be extended indefinitely. With current technology, they would still have to be held in a vacuum chamber, but they would allow many more qubits to be crammed inside.

“We believe that surface traps are a key technology to enable these systems to scale to the very large number of ions that will be required for large-scale quantum computing,” says Jeremy Sage, who together with John Chiaverini leads Lincoln Laboratory’s trapped-ion quantum-information-processing project. “These cage traps work very well, but they really only work for maybe 10 to 20 ions, and they basically max out around there.”

Performing a quantum computation, however, requires precisely controlling the energy state of every qubit independently, and trapped-ion qubits are controlled with laser beams. In a surface trap, the ions are only about 5 micrometers apart. Hitting a single ion with an external laser, without affecting its neighbors, is incredibly difficult; only a few groups had previously attempted it, and their techniques weren’t practical for large-scale systems.

Getting onboard

That’s where Ram’s group comes in. Ram and Karan Mehta, an MIT graduate student in electrical engineering and first author on the new paper, designed and built a suite of on-chip optical components that can channel laser light toward individual ions. Sage, Chiaverini, and their Lincoln Lab colleagues Colin Bruzewicz and Robert McConnell retooled their surface trap to accommodate the integrated optics without compromising its performance. Together, both groups designed and executed the experiments to test the new system.

“Typically, for surface electrode traps, the laser beam is coming from an optical table and entering this system, so there’s always this concern about the beam vibrating or moving,” Ram says. “With photonic integration, you’re not concerned about beam-pointing stability, because it’s all on the same chip that the electrodes are on. So now everything is registered against each other, and it’s stable.”

The researchers’ new chip is built on a quartz substrate. On top of the quartz is a network of silicon nitride “waveguides,” which route laser light across the chip. Above the waveguides is a layer of glass, and on top of that are niobium electrodes with tiny holes in them to allow light to pass through. Beneath the holes in the electrodes, the waveguides break into a series of sequential ridges, a “diffraction grating” precisely engineered to direct light up through the holes and concentrate it into a beam narrow enough that it will target a single ion, 50 micrometers above the surface of the chip.
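The angle at which a grating coupler like this emits light follows from a standard phase-matching condition: for first-order diffraction into air, sin θ = n_eff − λ/Λ, where n_eff is the effective index of the guided mode, λ the wavelength, and Λ the grating period. The numbers below are assumptions chosen for illustration, not values from the paper:

```python
import math

# Illustrative grating-coupler phase matching (all numbers are assumptions):
# first-order diffraction emits into air at angle theta, where
#     sin(theta) = n_eff - wavelength / period.
wavelength = 674e-9   # assumed red trapped-ion transition wavelength
n_eff = 1.8           # assumed effective index of the silicon nitride mode

def emission_angle(period):
    """Emission angle in degrees for a given grating period (meters)."""
    s = n_eff - wavelength / period
    return math.degrees(math.asin(s))

# Straight-up emission requires a pitch of roughly wavelength / n_eff.
vertical_period = wavelength / n_eff
print(f"{vertical_period * 1e9:.0f} nm pitch for vertical emission")
print(f"{emission_angle(400e-9):.1f} degrees for a 400 nm pitch")
```

Choosing the pitch therefore sets where, above the chip, the beam converges on an ion.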


With the prototype chip, the researchers were evaluating the performance of the diffraction gratings and the ion traps, but there was no mechanism for varying the amount of light delivered to each ion. In ongoing work, the researchers are investigating the addition of light modulators to the diffraction gratings, so that different qubits can simultaneously receive light of different, time-varying intensities. That would make programming the qubits more efficient, which is vital in a practical quantum information system, since the number of quantum operations the system can perform is limited by the “coherence time” of the qubits.

“As far as I know, this is the first serious attempt to integrate optical waveguides in the same chip as an ion trap, which is a very significant step forward on the path to scaling up ion-trap quantum information processors [QIP] to the sort of size which will ultimately contain the number of qubits necessary for doing useful QIP,” says David Lucas, a professor of physics at Oxford University. “Trapped-ion qubits are well-known for being able to achieve record-breaking coherence times and very precise operations on small numbers of qubits. Arguably, the most important area in which progress needs to be made is technologies which will enable the systems to be scaled up to larger numbers of qubits. This is exactly the need being addressed so impressively by this research.”

“Of course, it’s important to appreciate that this is a first demonstration,” Lucas adds. “But there are good prospects for believing that the technology can be improved substantially. As a first step, it’s a wonderful piece of work.”

First major database of non-native English

After thousands of hours of work, MIT researchers have released the first major database of fully annotated English sentences written by non-native speakers.

The researchers who led the project had already shown that the grammatical quirks of non-native speakers writing in English could be a source of linguistic insight. But they hope that their dataset could also lead to applications that would improve computers’ handling of spoken or written language of non-native English speakers.

“English is the most used language on the Internet, with over 1 billion speakers,” says Yevgeni Berzak, a graduate student in electrical engineering and computer science, who led the new project. “Most of the people who speak English in the world or produce English text are non-native speakers. This characteristic is often overlooked when we study English scientifically or when we do natural-language processing for English.”

Most natural-language-processing systems, which enable smartphone and other computer applications to process requests phrased in ordinary language, are based on machine learning, in which computer systems look for patterns in huge sets of training data. “If you want to handle noncanonical learner language, in terms of the training material that’s available to you, you can only train on standard English,” Berzak explains.

Systems trained on nonstandard English, on the other hand, could be better able to handle the idiosyncrasies of non-native English speakers, such as tendencies to drop or add prepositions, to substitute particular tenses for others, or to misuse particular auxiliary verbs. Indeed, the researchers hope that their work could lead to grammar-correction software targeted to native speakers of other languages.

Diagramming sentences

The researchers’ dataset consists of 5,124 sentences culled from exam essays written by students of English as a second language (ESL). The sentences were drawn, in approximately equal distribution, from native speakers of 10 languages that are the primary tongues of roughly 40 percent of the world’s population.

Every sentence in the dataset includes at least one grammatical error. The original source of the sentences was a collection made public by Cambridge University, which included annotation of the errors, but no other grammatical or syntactic information.

To provide that additional information, Berzak recruited a group of MIT undergraduate and graduate students from the departments of Electrical Engineering and Computer Science (EECS), Linguistics, and Mechanical Engineering, led by Carolyn Spadine, a graduate student in linguistics.

After eight weeks of training in how to annotate both grammatically correct and error-ridden sentences, the students began working directly on the data. There are three levels of annotation. The first involves basic parts of speech — whether a word is a noun, a verb, a preposition, and so on. The next is a more detailed description of parts of speech — plural versus singular nouns, verb tenses, comparative and superlative adjectives, and the like.

Next, the annotators charted the syntactic relationships between the words of the sentences, using a relatively new annotation scheme called the Universal Dependency formalism. Syntactic relationships include things like which nouns are the objects of which verbs, which verbs are auxiliaries of other verbs, which adjectives modify which nouns, and so on.
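The three layers can be pictured together for a single (invented) ESL sentence. The tags and relation names below follow Universal Dependencies conventions, but the sentence and its analysis are illustrative, not taken from the dataset:

```python
# Illustrative annotation of an invented ESL sentence, layered as described:
# coarse part of speech, finer-grained features, then a dependency head
# (0 = root of the sentence) and relation for each word.
sentence = "He go to school"
tokens = [
    # (word, coarse POS, fine-grained features, head index, relation)
    ("He",     "PRON", "Case=Nom|Number=Sing", 2, "nsubj"),
    ("go",     "VERB", "Tense=Pres",           0, "root"),  # error: "goes"
    ("to",     "ADP",  "_",                    4, "case"),
    ("school", "NOUN", "Number=Sing",          2, "obl"),
]

# The corrected sentence ("He goes to school") gets its own parse; here only
# the verb's form changes, so the tree structure is identical.
for word, pos, feats, head, rel in tokens:
    print(f"{word:8} {pos:5} {feats:22} head={head} rel={rel}")
```

Annotating the erroneous and corrected versions side by side is what lets a learning system see exactly which relations an error disturbs.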

The annotators created syntactic charts for both the corrected and uncorrected versions of each sentence. That required some prior conceptual work, since grammatical errors can make words’ syntactic roles difficult to interpret.

Berzak and Spadine wrote a 20-page guide to their annotation scheme, much of which dealt with the handling of error-ridden sentences. Consistency in the treatment of such sentences is essential to any envisioned application of the dataset: A machine-learning system can’t learn to recognize an error if the error is described differently in different training examples.

Repeatable results

The researchers’ methodology, however, provides good evidence that annotators can chart ungrammatical sentences consistently. For every sentence, one evaluator annotated it completely; another reviewed the annotations and flagged any areas of disagreement; and a third ruled on the disagreements.

There was some disagreement on how to handle ungrammatical sentences — but there was some disagreement on how to handle grammatical sentences, too. In general, levels of agreement were comparable for both types of sentences.

The researchers report these and other results in a paper being presented at the Association for Computational Linguistics annual conference in August. Joining Berzak and Spadine on the paper are Boris Katz, who is Berzak’s advisor and a principal research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory; and the undergraduate annotators: Jessica Kenney, Jing Xian Wang, Lucia Lam, Keiko Sophie Mori, and Sebastian Garza.

The researchers’ dataset is now one of the 59 datasets available from the organization that oversees the Universal Dependency (UD) standard. Berzak also created an online interface for the dataset, so that researchers can look for particular kinds of errors, in sentences produced by native speakers of particular languages, and the like.

“What I find most interesting about the ESL [dataset] is that the use of UD opens up a lot of possibilities for systematically comparing the ESL data not only to native English but also to other languages that have corpora annotated using UD,” says Joakim Nivre, a professor of computational linguistics at Uppsala University in Sweden and one of the developers of the UD standard. “Hopefully, other ESL researchers will follow their example, which will enable further comparisons along several dimensions, ESL to ESL, ESL to native, et cetera.”

“The decision to annotate both incorrect and corrected sentences makes the material very valuable,” Nivre adds. “I can see, for example, how this could be cast as a machine translation task, where the system learns to translate from ESL to English. The current corpus would essentially provide the parallel data necessary to train such a system, and the availability of syntactic annotation for both sides opens up more diverse technical approaches.”

The work was funded in part by the National Science Foundation, under the auspices of MIT’s Center for Brains, Minds, and Machines.

Automating bug finding

Symbolic execution is a powerful software-analysis tool that can be used to automatically locate and even repair programming bugs. Essentially, it traces out every path that a program’s execution might take.

But it tends not to work well with applications written using today’s programming frameworks. An application might consist of only 1,000 lines of new code, but it will generally import functions — such as those that handle virtual buttons — from a programming framework, which includes huge libraries of frequently reused code. The additional burden of evaluating the imported code makes symbolic execution prohibitively time consuming.
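In miniature, symbolic execution replaces concrete inputs with symbols and forks at every branch, accumulating the constraints each path assumes. The hand-rolled toy below (no constraint solver, just explicit enumeration) shows why the number of paths, and hence the cost, grows with every branch the analyzer must step through:

```python
# Toy symbolic execution of this function:
#     def f(x):
#         if x > 10:
#             if x < 20: return "bug"
#             return "big"
#         return "small"
# We enumerate each feasible path with the constraints it accumulates.

def explore():
    paths = []
    # Branch 1: is x > 10 ?
    for c1, taken1 in [("x > 10", True), ("x <= 10", False)]:
        if not taken1:
            paths.append(([c1], "small"))
            continue
        # Branch 2: is x < 20 ?
        for c2, taken2 in [("x < 20", True), ("x >= 20", False)]:
            outcome = "bug" if taken2 else "big"
            paths.append(([c1, c2], outcome))
    return paths

for constraints, outcome in explore():
    print(" and ".join(constraints), "->", outcome)
# Three paths. An analyzer would flag the one reaching "bug" and solve its
# constraints (10 < x < 20) to produce a concrete failing input.
```

Every library call a program imports adds its own internal branches to this enumeration, which is exactly the blowup the framework models are meant to avoid.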

Computer scientists address this problem by creating simple models of the imported libraries, which describe their interactions with new programs but don’t require line-by-line evaluation of their code. Building the models, however, is labor-intensive and error prone, and the models require regular updates, as programming frameworks are constantly evolving.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory, working with colleagues at the University of Maryland, have taken an important step toward enabling symbolic execution of applications written using programming frameworks, with a system that automatically constructs models of framework libraries.

The researchers compared a model generated by their system with a widely used model of Java’s standard library of graphical-user-interface components, which had been laboriously constructed over a period of years. They found that their new model plugged several holes in the hand-coded one.

They described their results in a paper they presented last week at the International Conference on Software Engineering. Their work was funded by the National Science Foundation’s Expeditions Program.

“Forty years ago, if you wanted to write a program, you went in, you wrote the code, and basically all the code you wrote was the code that executed,” says Armando Solar-Lezama, an associate professor of electrical engineering and computer science at MIT, whose group led the new work. “But today, if you want to write a program, you go and bring in these huge frameworks and these huge pieces of functionality that you then glue together, and you write a little code to get them to interact with each other. If you don’t understand what that big framework is doing, you’re not even going to know where your program is going to start executing.”

Consequently, a program analyzer can’t just dispense with the framework code and concentrate on the newly written code. But symbolic execution works by stepping through every instruction that a program executes for a wide range of input values. That process becomes untenable if the analyzer has to evaluate every instruction involved in adding a virtual button to a window — the positioning of the button on the screen, the movement of the button when the user scrolls down the window, the button’s change of appearance when it’s pressed, and so on.

For purposes of analysis, all that matters is what happens when the button is pressed, so that’s the only aspect of the button’s functionality that a framework model needs to capture. More precisely, the model describes only what happens when code imported from a standard programming framework returns control of a program to newly written code.

“The only thing we care about is what crosses the boundary between the application and the framework,” says Xiaokang Qiu, a postdoc in Solar-Lezama’s lab and a co-author on the new paper. “The framework itself is like a black box that we want to abstract away.”

To generate their model, the researchers ran a suite of tutorials designed to teach novices how to program in Java. Their system automatically tracked the interactions between the tutorial code and the framework code that the tutorials imported.

“The nice thing about tutorials is that they’re designed to help people understand how the framework works, so they’re also a good way to teach the synthesizer how the framework works,” Solar-Lezama says. “The problem is that if I just show you a trace of what my program did, there’s an infinite set of programs that could behave like that trace.”

To winnow down that set of possibilities, the researchers’ system tries to fit the program traces to a set of standard software “design patterns.” First proposed in the late 1970s and popularized in a 1995 book called “Design Patterns,” design patterns are based on the idea that most problems in software engineering fit into just a few categories, and their solutions have just a few general shapes.

Computer scientists have identified roughly 20 design patterns that describe communication between different components of a computer program. Solar-Lezama, Qiu, and their Maryland colleagues — Jinseong Jeon, Jonathan Fetter-Degges, and Jeffrey Foster — built four such patterns into their new system, which they call Pasket, for “pattern sketcher.” For any given group of program traces, Pasket tries to fit it to each of the design patterns, selecting only the one that works best.
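The idea of "fitting a trace to a pattern" can be sketched for one well-known pattern, the Observer, in which listeners register with a subject and are then notified of events. The check below is invented for illustration and is far simpler than what Pasket actually does:

```python
# Invented sketch of fitting a trace to a design pattern: an Observer-style
# pattern expects every notify() to reach exactly the listeners registered
# beforehand. We check a trace of (event, argument) records against that shape.
def fits_observer(trace):
    registered = set()
    for event, arg in trace:
        if event == "register":
            registered.add(arg)
        elif event == "notify":
            if set(arg) != registered:  # notified set must match registrations
                return False
        else:
            return False                # event outside the pattern's vocabulary
    return True

good = [("register", "a"), ("register", "b"), ("notify", ["a", "b"])]
bad = [("register", "a"), ("notify", ["a", "b"])]  # "b" never registered
print(fits_observer(good), fits_observer(bad))  # -> True False
```

A trace that fits a pattern pins down the framework's behavior far more tightly than the raw trace alone, which is how Pasket winnows the infinite set of candidate programs.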

Because a given design pattern must describe solutions to a huge range of problems that vary in their particulars, design patterns are described in the computer science literature in very general terms. Fortunately, Solar-Lezama has spent much of his career developing a system, called Sketch, that takes general descriptions of program functionality and fills in the low-level computational details. Sketch is the basis of most of his group’s original research, and it’s what reconciles design patterns and program traces in Pasket.

“The availability of models for frameworks such as Swing [Java’s library of graphical-user-interface components] and Android is critical for enabling symbolic execution of applications built using these frameworks,” says Rajiv Gupta, a professor of computer science and engineering at the University of California at Riverside. “At present, framework models are developed and maintained manually. This work offers a compelling demonstration of how far synthesis technology has advanced. The scalability of Pasket is impressive — in a few minutes, it synthesized nearly 2,700 lines of code. Moreover, the generated models compare favorably with manually created ones.”

Speedy terahertz-based system may detect explosives

Terahertz spectroscopy, which uses the band of electromagnetic radiation between microwaves and infrared light, is a promising security technology because it can extract the spectroscopic “fingerprints” of a wide range of materials, including chemicals used in explosives.

But traditional terahertz spectroscopy requires a radiation source that’s heavy and about the size of a large suitcase, and it takes 15 to 30 minutes to analyze a single sample, rendering it impractical for most applications.

In the latest issue of the journal Optica, researchers from MIT’s Research Laboratory of Electronics and their colleagues present a new terahertz spectroscopy system that uses a quantum cascade laser, a source of terahertz radiation that’s the size of a computer chip. The system can extract a material’s spectroscopic signature in just 100 microseconds.

The device is so efficient because it emits terahertz radiation in what’s known as a “frequency comb,” meaning a range of frequencies that are perfectly evenly spaced.

“With this work, we answer the question, ‘What is the real application of quantum-cascade laser frequency combs?’” says Yang Yang, a graduate student in electrical engineering and computer science and first author on the new paper. “Terahertz is such a unique region that spectroscopy is probably the best application. And QCL-based frequency combs are a great candidate for spectroscopy.”

Different materials absorb different frequencies of terahertz radiation to different degrees, giving each of them a unique terahertz-absorption profile. Traditionally, however, terahertz spectroscopy has required measuring a material’s response to each frequency separately, a process that involves mechanically readjusting the spectroscopic apparatus. That’s why the method has been so time consuming.

Because the frequencies in a frequency comb are evenly spaced, however, it’s possible to mathematically reconstruct a material’s absorption fingerprint from just a few measurements, without any mechanical adjustments.

Getting even

The trick is evening out the spacing in the comb. Quantum cascade lasers, like all electrically powered lasers, bounce electromagnetic radiation back and forth through a “gain medium” until the radiation has enough energy to escape. They emit radiation at multiple frequencies that are determined by the length of the gain medium.

But those frequencies are also dependent on the medium’s refractive index, which describes the speed at which electromagnetic radiation passes through it. And the refractive index varies for different frequencies, so the gaps between frequencies in the comb vary, too.
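The effect of dispersion on comb spacing can be seen with the textbook relation for a Fabry-Perot cavity, f_N = Nc/(2nL): if the index n is constant, adjacent modes are evenly spaced, but if n varies with frequency, the gaps drift. The numbers below are assumptions for illustration, not parameters from the paper:

```python
# Illustrative cavity-mode calculation (all numbers are assumptions):
# mode frequencies of a Fabry-Perot cavity sit at f_N = N * c / (2 * n * L).
c = 3.0e8    # speed of light, m/s
L = 5.0e-3   # assumed 5 mm cavity length

def mode_frequency(N, n):
    return N * c / (2 * n * L)

# Constant refractive index: perfectly even spacing (a true comb).
even = [mode_frequency(N, 3.6) for N in range(100, 104)]
gaps_even = [b - a for a, b in zip(even, even[1:])]

# Mildly dispersive index, n growing 0.1% per mode: spacing drifts.
disp = [mode_frequency(N, 3.6 + 0.001 * (N - 100)) for N in range(100, 104)]
gaps_disp = [b - a for a, b in zip(disp, disp[1:])]

print(gaps_even)  # identical gaps
print(gaps_disp)  # drifting gaps
```

The engineered indentations in the gain medium are, in effect, a way of flattening n across the band so that the first set of gaps is what the laser actually produces.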

To even out their lasers’ frequencies, the MIT researchers and their colleagues use an oddly shaped gain medium, with regular, symmetrical indentations in its sides that alter the medium’s refractive index and restore uniformity to the distribution of the emitted frequencies.

Yang; his advisor, Qing Hu, the Distinguished Professor in Electrical Engineering and Computer Science; and David Burghoff, who received his PhD in electrical engineering and computer science from MIT in 2014 and is now a research scientist in Hu’s group, reported this design in Nature Photonics in 2014, with Burghoff as first author. But while their first prototype demonstrated the design’s feasibility, it in fact emitted two frequency combs, clustered around two different central frequencies, with a gap between them, which made it less than ideal for spectroscopy.

In the new work, Yang and Burghoff, who are joint first authors; Hu; Darren Hayton and Jian-Rong Gao of the Netherlands Institute for Space Research; and John Reno of Sandia National Laboratories developed a new gain medium that produces a single, unbroken frequency comb. Like the previous gain medium, the new one consists of hundreds of alternating layers of gallium arsenide and aluminum gallium arsenide, with different but precisely calibrated thicknesses.

Getting practical

As a proof of concept, the researchers used their system to measure the spectral signature of not a chemical sample but an optical device called an etalon, made from a wafer of gallium arsenide, whose spectral properties could be calculated theoretically in advance, providing a clear standard of comparison. The new system’s measurements were a very good fit for the etalon’s terahertz-transmission profile, suggesting that it could be useful for detecting chemicals.

Although terahertz quantum cascade lasers are of chip scale, they need to be cooled to very low temperatures, so they require refrigerated housings that can be inconveniently bulky. Hu’s group continues to work on the design of increasingly high-temperature quantum cascade lasers, but in the new paper, Yang and his colleagues demonstrated that they could extract a reliable spectroscopic signature from a target using only very short bursts of terahertz radiation. That could make terahertz spectroscopy practical even while the lasers still require cryogenic cooling, since brief pulses sharply reduce the heat load on the refrigeration system.

“We used to consume 10 watts, but my laser turns on only 1 percent of the time, which significantly reduces the refrigeration constraints,” Yang explains. “So we can use compact-sized cooling.”

“This paper is a breakthrough, because these kinds of sources were not available in terahertz,” says Gerard Wysocki, an assistant professor of electrical engineering at Princeton University. “Qing Hu is the first to actually present terahertz frequency combs that are semiconductor devices, all integrated, which promise very compact broadband terahertz spectrometers.”

“Because they used these very inventive phase correction techniques, they have demonstrated that even with pulsed sources you can extract data that is reasonably high resolution already,” Wysocki continues. “That’s a technique that they are pioneering, and this is a great first step toward chemical sensing in the terahertz region.”

Lincoln Laboratory supercomputing center

MIT Lincoln Laboratory has been a world leader in interactive supercomputing since the 1950s. In 1955, TX-0, the first fully transistor-based computer, was built to support a wide range of research at the laboratory and the MIT campus, and became the basis for the second-largest computing company in the world, Digital Equipment Corporation. In 2001, the laboratory developed Parallel Matlab, which enabled thousands of researchers worldwide to use interactive supercomputing for high-performance data analysis. In 2008, the laboratory demonstrated the largest single problem ever run on a computer, using its TX-2500 supercomputer, a part of the system called LLGrid. In April, the laboratory acknowledged the importance of the LLGrid world-class computing capability with the establishment of the Lincoln Laboratory Supercomputing Center (LLSC).

LLSC is based, in part, on the LLGrid infrastructure, but was developed to enhance computing power and accessibility to more than 1,000 researchers across the Institute. “By establishing the LLSC, Lincoln Laboratory will be able to better address supercomputing needs across all Laboratory missions, develop new supercomputing capabilities and technologies, and spawn even closer collaborations with MIT campus supercomputing initiatives,” says Jeremy Kepner, laboratory fellow, and head of the LLSC. “These brilliant engineers, scientists, faculty, and students use our capabilities to conduct research in diverse fields such as space observations, robotic vehicles, communications, cybersecurity, machine learning, sensor processing, electronic devices, bioinformatics, and air traffic control.”

Only 13 years ago, the laboratory’s supercomputing capability, LLGrid, was composed of a single 16-processor system. Albert Reuther, manager of LLSC, says that a “different kind of supercomputing” was clearly needed to meet the needs of laboratory researchers. Since then, the capability has expanded to thousands of processors across several systems. In addition, Reuther says that the center differs from others like it because of the team’s “focus on interactive supercomputing for high-performance data analysis,” and the “extremely ‘green’ computing center in Holyoke, Massachusetts, which allows our computers to run 93 percent carbon-free.”

“This new level of supercomputing capability will be a key technology for the computational fluid dynamics (CFD) work performed in the Structural and Thermal-Fluids Engineering Group,” says Nathan J. Falkiewicz. Falkiewicz explains that the new capability will allow his team to take advantage of the parallelism inherent in existing CFD codes to significantly reduce simulation time for computationally taxing problems, as well as enable simulation for certain classes of problems that would otherwise have “prohibitively long” execution times without access to large core-count, high-performance computing clusters.

Orion S. Crisafulli of the Active Optical Systems Group says that the supercomputing capability has enabled his team, in collaboration with MIT campus, to run complex simulations in the performance investigation of a compact microlidar system. “Access to a large number of compute nodes, each with substantial memory and a streamlined job submission process, has shortened the run time for our simulations from a week to a few hours,” Crisafulli says. “This allows us to explore a significantly larger system parameter space than we would otherwise be able to, and ultimately achieve a more complete understanding of the capabilities of the microlidar system concept.”

Reuther says that the LLSC exists today in large part because of the researchers who utilize supercomputing capabilities to produce cutting-edge research results, as well as many other supporters: “LLSC has been blessed to have the support of visionaries in the Director’s Office, the Technology Office, and the Steering Committee who have seen the potential of supercomputing to enable all of the Laboratory’s missions.” Reuther also credits the MIT Lincoln Laboratory Beaver Works Center for playing a critical role in the LLSC’s collaborations with campus.

“Creating the Lincoln Laboratory Supercomputing Center has been a goal for the team for many years, and it is tremendously rewarding to see it come to fruition,” Kepner says. “Laboratory researchers will see continued improvement in the LLSC systems, MIT Campus will benefit from our unique interactive supercomputing technologies, and Laboratory and campus researchers will be able to collaborate more closely on their joint research projects.”

Patching up Web applications

By exploiting some peculiarities of the popular Web programming framework Ruby on Rails, MIT researchers have developed a system that can quickly comb through tens of thousands of lines of application code to find security flaws.

In tests on 50 popular Web applications written using Ruby on Rails, the system found 23 previously undiagnosed security flaws, and it took no more than 64 seconds to analyze any given program.

The researchers will present their results at the International Conference on Software Engineering, in May.

According to Daniel Jackson, professor in the Department of Electrical Engineering and Computer Science, the new system uses a technique called static analysis, which seeks to describe, in a very general way, how data flows through a program.

“The classic example of this is if you wanted to do an abstract analysis of a program that manipulates integers, you might divide the integers into the positive integers, the negative integers, and zero,” Jackson explains. The static analysis would then evaluate every operation in the program according to its effect on integers’ signs. Adding two positives yields a positive; adding two negatives yields a negative; multiplying two negatives yields a positive; and so on.

“The problem with this is that it can’t be completely accurate, because you lose information,” Jackson says. “If you add a positive and a negative integer, you don’t know whether the answer will be positive, negative, or zero. Most work on static analysis is focused on trying to make the analysis more scalable and accurate to overcome those sorts of problems.”
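Jackson's sign example translates directly into code. The sketch below implements the abstraction he describes, including the "unknown" value that captures the precision loss he mentions:

```python
# Sign-domain abstract interpretation, following Jackson's example: track
# only {NEG, ZERO, POS}, plus an "unknown" value for lost information.
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "unknown"

def abstract_add(a, b):
    if ZERO in (a, b):
        return b if a == ZERO else a  # adding zero changes nothing
    if a == b and a in (NEG, POS):
        return a                      # pos+pos = pos, neg+neg = neg
    return TOP                        # pos+neg: sign cannot be determined

def abstract_mul(a, b):
    if TOP in (a, b):
        return TOP
    if ZERO in (a, b):
        return ZERO
    return POS if a == b else NEG     # matching signs give pos, else neg

print(abstract_add(POS, POS))  # -> pos
print(abstract_mul(NEG, NEG))  # -> pos
print(abstract_add(POS, NEG))  # -> unknown: the precision loss Jackson describes
```

Once a value becomes "unknown," everything computed from it tends to become unknown too, which is why conventional static analyses fight so hard for precision.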

With Web applications, however, the cost of accuracy is prohibitively high, Jackson says. “The program under analysis is just huge,” he says. “Even if you wrote a small program, it sits atop a vast edifice of libraries and plug-ins and frameworks. So when you look at something like a Web application written in a language like Ruby on Rails, if you try to do a conventional static analysis, you typically find yourself mired in this huge bog. And this makes it really infeasible in practice.”

That vast edifice of libraries, however, also gave Jackson and his former student Joseph Near, who graduated from MIT last spring and is now doing a postdoc at the University of California at Berkeley, a way to make static analysis of programs written in Ruby on Rails practical.

A library is a compendium of code that programmers tend to use over and over again. Rather than rewriting the same functions for each new program, a programmer can just import them from a library.

Ruby on Rails — or Rails, as it’s called for short — has the peculiarity of defining even its most basic operations in libraries. Every addition, every assignment of a particular value to a variable, imports code from a library.

Near rewrote those libraries so that the operations defined in them describe their own behavior in a logical language. That turns the Rails interpreter, which converts high-level Rails programs into machine-readable code, into a static-analysis tool. With Near’s libraries, running a Rails program through the interpreter produces a formal, line-by-line description of how the program handles data.
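The idea of operations that describe their own behavior can be sketched with symbolic values whose primitive operations build up a record instead of computing numbers. This is a minimal illustration of the concept, not Near's actual libraries or logical language:

```python
# Hypothetical sketch: if the primitive operations themselves build a
# symbolic record, then simply *running* a program yields a description
# of what it does rather than a concrete result.
class Sym:
    def __init__(self, expr):
        self.expr = expr  # string describing how this value was computed

    def __add__(self, other):
        other_expr = other.expr if isinstance(other, Sym) else repr(other)
        return Sym(f"(+ {self.expr} {other_expr})")

    def __repr__(self):
        return self.expr

def program(x, y):
    # An ordinary-looking program; fed Sym inputs, it describes itself.
    total = x + y
    return total + 1

description = program(Sym("x"), Sym("y"))
print(description)  # (+ (+ x y) 1)
```

Running the program through the instrumented primitives yields a formal description of its data flow, which is the effect Near's rewritten libraries achieve for the whole Rails interpreter.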

In his PhD work, Near used this general machinery to build three different debuggers for Ruby on Rails applications, each requiring different degrees of programmer involvement. The one described in the new paper, which the researchers call Space, evaluates a program’s data access procedures.

Near identified seven different ways in which Web applications typically control access to data. Some data are publicly available, some are available only to users who are currently logged in, some are private to individual users, some users — administrators — have access to select aspects of everyone’s data, and so on.

For each of these data-access patterns, Near developed a simple logical model that describes what operations a user can perform on what data, under what circumstances. From the descriptions generated by the hacked libraries, Space can automatically determine whether the program adheres to those models. If it doesn’t, there’s likely to be a security flaw.
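A model check of this kind can be sketched as follows. The policy names, resources, and the `allowed` function are illustrative inventions, not Space's actual models or API:

```python
# Hypothetical sketch of checking data accesses against simple
# access-control models of the kind the article describes.
PUBLIC, LOGGED_IN, OWNER_ONLY = "public", "logged_in", "owner_only"

POLICIES = {
    "profile_photo": PUBLIC,
    "inbox": OWNER_ONLY,
    "friend_list": LOGGED_IN,
}

def allowed(user, owner, resource):
    """Decide whether `user` may read `owner`'s `resource`."""
    policy = POLICIES[resource]
    if policy == PUBLIC:
        return True
    if policy == LOGGED_IN:
        return user is not None   # any logged-in user
    return user == owner          # OWNER_ONLY: private to the owner

def check_accesses(accesses):
    """Return the accesses (user, owner, resource) that violate the models."""
    return [a for a in accesses if not allowed(*a)]

violations = check_accesses([
    (None, "alice", "profile_photo"),   # anonymous read of public data: fine
    ("bob", "alice", "inbox"),          # bob reading alice's inbox: flagged
])
print(violations)  # [('bob', 'alice', 'inbox')]
```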

Using Space does require someone with access to the application code to determine which program variables and functions correspond to which aspects of Near’s models. But that isn’t an onerous requirement: Near was able to map correspondences for all 50 of the applications he evaluated. And that mapping should be even easier for a programmer involved in an application’s development from the outset, rather than coming to it from the outside as Near did.

How to secure user-controlled data

Most people with smartphones use a range of applications that collect personal information and store it on Internet-connected servers — and from their desktop or laptop computers, they connect to Web services that do the same. Some use still other Internet-connected devices, such as thermostats or fitness monitors, that also store personal data online.

Generally, users have no idea which data items their apps are collecting, where they’re stored, and whether they’re stored securely. Researchers at MIT and Harvard University hope to change that, with an application they’re calling Sieve.

With Sieve, a Web user would store all of his or her personal data, in encrypted form, on the cloud. Any app that wanted to use specific data items would send a request to the user and receive a secret key that decrypted only those items. If the user wanted to revoke the app’s access, Sieve would re-encrypt the data with a new key.

“This is a rethinking of the Web infrastructure,” says Frank Wang, a PhD student in electrical engineering and computer science and one of the system’s designers. “Maybe it’s better that one person manages all their data. There’s one type of security and not 10 types of security. We’re trying to present an alternative model that would be beneficial to both users and applications.”

The researchers are presenting Sieve at the USENIX Symposium on Networked Systems Design and Implementation this month. Wang is the first author, and he’s joined by MIT associate professors of electrical engineering and computer science Nickolai Zeldovich and Vinod Vaikuntanathan, who is MIT’s Steven and Renee Finn Career Development Professor, and by James Mickens, an associate professor of computer science at Harvard University.

Selective disclosure

Sieve required the researchers to develop practical versions of two cutting-edge cryptographic techniques called attribute-based encryption and key homomorphism.

With attribute-based encryption, data items in a file are assigned different labels, or “attributes.” After encryption, secret keys can be generated that unlock only particular combinations of attributes: name and zip code but not street name, for instance, or zip code and date of birth but not name.
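The selective-disclosure behavior can be illustrated with a deliberately simplified stand-in: give each attribute its own symmetric key, and let a "secret key" for a set of attributes be just that subset of keys. Real attribute-based encryption is far more sophisticated (a single compact key covers arbitrary attribute combinations), but the access pattern is the same:

```python
# Toy stand-in for attribute-based encryption: per-attribute symmetric
# keys, with a keyed hash used as a simple keystream. Illustrative
# only -- not Sieve's actual construction.
import os
from hashlib import blake2b

def keystream(key, length):
    return blake2b(key, digest_size=length).digest()

def xor(data, pad):
    return bytes(a ^ b for a, b in zip(data, pad))

master_keys = {attr: os.urandom(16) for attr in ("name", "zip", "dob")}

def encrypt(record):
    # Encrypt each field under its attribute's key.
    return {attr: xor(val, keystream(master_keys[attr], len(val)))
            for attr, val in record.items()}

def make_key(attrs):
    # A secret key that unlocks only the requested attributes.
    return {a: master_keys[a] for a in attrs}

def decrypt(ciphertexts, secret_key):
    # Only fields whose attribute is covered by the key are recovered.
    return {attr: xor(ct, keystream(secret_key[attr], len(ct)))
            for attr, ct in ciphertexts.items() if attr in secret_key}

ct = encrypt({"name": b"Alice", "zip": b"02139", "dob": b"1990-01-01"})
partial = decrypt(ct, make_key({"zip", "dob"}))
# partial recovers zip and dob, but name stays encrypted
```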

The problem with attribute-based encryption — and decryption — is that it’s slow. To get around that, the MIT and Harvard researchers envision that Sieve users would lump certain types of data together under a single attribute. For instance, a doctor might be interested in data from a patient’s fitness-tracking device but probably not in the details of a single afternoon’s run. The user might choose to group fitness data by month.

This introduces problems of its own, however. A fitness-tracking device probably wants to store data online as soon as the data is generated, rather than waiting until the end of the month for a bulk upload. But data uploaded to the cloud yesterday could end up in a very different physical location than data uploaded by the same device today.

So Sieve includes tables that track the locations at which grouped data items are stored in the cloud. Each of those tables is encrypted under a single attribute, but the data they point to are encrypted using standard — and more efficient — encryption algorithms. As a consequence, the size of the data item encrypted through attribute-based encryption — the table — is fixed, which makes decryption more efficient.

In experiments, the researchers found that decrypting a month’s worth of, say, daily running times grouped under a single attribute would take about 1.5 seconds, whereas if each day’s result was encrypted under its own attribute, decrypting a month’s worth would take 15 seconds.

Wang developed an interface that displays a Sieve user’s data items as a list and allows the user to create and label icons that represent different attributes. Dragging a data item onto an icon assigns it that attribute. At the moment, the interface is not particularly user friendly, but its purpose is to show that the underlying encryption machinery works properly.

Blind manipulation

Key homomorphism is what enables Sieve to revoke an app’s access to a user’s data. With key homomorphism, the cloud server can re-encrypt the data it’s storing without decrypting it first — or without sending it to the user for decryption, re-encryption, and re-uploading. In this case, the researchers had to turn work that was largely theoretical into a working system.
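The re-encryption-without-decryption trick can be demonstrated with a toy stream cipher, where a ciphertext is the message XORed with a keystream derived from the key. Given only the XOR of the old and new keystreams, the server can switch a ciphertext to the new key without ever seeing the plaintext. This is an illustration of the principle, not Sieve's actual construction, which rests on rigorously analyzed key-homomorphic primitives:

```python
# Toy key-homomorphic re-encryption with E_k(m) = m XOR stream(k).
# The server learns neither key and never sees the plaintext.
from hashlib import blake2b

def stream(key, length):
    return blake2b(key, digest_size=length).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

msg = b"fitness data"
old_key, new_key = b"key-1", b"key-2"

ciphertext = xor(msg, stream(old_key, len(msg)))        # user encrypts

# The user sends only this delta -- not the keys, not the data.
delta = xor(stream(old_key, len(msg)), stream(new_key, len(msg)))

reencrypted = xor(ciphertext, delta)                    # server re-encrypts blindly
assert xor(reencrypted, stream(new_key, len(msg))) == msg
```

Because (m ⊕ ks₁) ⊕ (ks₁ ⊕ ks₂) = m ⊕ ks₂, the result is exactly the ciphertext under the new key.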

“All these things in cryptography are very vague,” Wang says. “They say, ‘Here’s an algorithm. Assume all these complicated math things.’ But in reality, how do I build this? They’re like, ‘Oh, this group has this property.’ But they don’t tell you what the group is. Are they numbers? Are they primes? Are they elliptic curves? It took us a month or so to wrap our heads around what we needed to do to get this to work.”

Of course, a system like Sieve requires the participation of app developers. But it could work to their advantage. A given application might provide more useful services if it had access to data collected by other devices. And were a system like Sieve commercially deployed, applications could distinguish themselves from their competitors by advertising themselves as Sieve-compliant.

“Privacy is increasing in importance and the debate between Apple’s iPhone encryption and the FBI is a good example of that,” says Engin Kirda, a professor of electrical and computer engineering at Northeastern University. “I think a lot of users would appreciate having cryptographic control over their own data.”

“I think the real innovation is how they use attribute-based encryption smartly, and make it usable in practice,” Kirda adds. “They show that it is possible to have private clouds where the users have real privacy control over their data.”

Protecting bulk power systems from hackers


Advances in smart grid technology — such as smart meters in homes, management systems for distributed energy resources like wind and solar production, and instrumentation systems in power plants, substations, and control centers — create both improvements in monitoring and entry points for hackers, says Chee-Wooi Ten, an associate professor of electrical and computer engineering at Michigan Technological University.

“Ten years ago, cybersecurity simply didn’t exist — it wasn’t talked about and it wasn’t a problem,” Ten says, joking that people thought he was crazy for suggesting power grid hacking was possible. “Now with events like in Ukraine last year and malware like Stuxnet, where hackers can plan for a cyberattack that can cause larger power outages, people are starting to grasp the severity of the problem.”

Ten points out that hackers target specific parts of the power infrastructure's control network, focusing on the mechanisms that control it. Automated systems manage much of the grid, from generation to transmission to use. As Ten puts it, the convenience and cost reduction of automation streamlines the process, but without solid security measures it also leaves the systems vulnerable. The interconnectedness of the grid can also produce cascading impacts, leading to blackouts, equipment failure, and islanding, in which regions become cut off and isolated from the main power grid.

Emerging Cybersecurity Threats

Ten and his team draw connections and assess weaknesses using a framework that would continually assess the bottlenecks of a power grid and its interconnections with neighboring grids. Using quantitative methods to prioritize cybersecurity protection would help ensure power grids are operated more securely and safely. Ten says it’s like measuring blood pressure.

“You know your health is at risk because we monitor systolic and diastolic numbers, so perhaps you work out more or eat healthier,” Ten says. “The grid needs established metrics for health too, a number to gauge if we are ready for this security challenge.”

With a better understanding of the system’s weaknesses, it’s easier to be strategic and shore up security risks. In the long run, Ten says improving regulations with specifics to match actual infrastructure needs and providing cybersecurity insurance will help.

“Simply because the remote substation networks are constantly commissioned with full compliance doesn’t mean they are secure,” Ten says. “There is going to be a tremendous impact if we’re negligent and fail to keep up with changes in communication infrastructure and emerging security threats.”

The Internet and your brain

“The founders of the Internet spent a lot of time considering how to make information flow efficiently,” says Salk Assistant Professor Saket Navlakha, coauthor of the new study that appears online in Neural Computation on February 9, 2017. “Finding that an engineered system and an evolved biological one arise at a similar solution to a problem is really interesting.”

In the engineered system, the solution involves controlling information flow such that routes are neither clogged nor underutilized, by checking how congested the Internet is. To accomplish this, the Internet employs an algorithm called “additive increase, multiplicative decrease” (AIMD), in which your computer sends a packet of data and then listens for an acknowledgement from the receiver. If the packet is promptly acknowledged, the network is not overloaded and your data can be transmitted through the network at a higher rate. With each successive successful packet, your computer knows it’s safe to increase its speed by one unit, which is the additive-increase part. But if an acknowledgement is delayed or lost, your computer knows that there is congestion and slows down by a large amount, such as by half, which is the multiplicative-decrease part. In this way, users gradually find their “sweet spot,” and congestion is avoided because users take their foot off the gas, so to speak, as soon as they notice a slowdown. As computers throughout the network employ this strategy, the whole system can continuously adjust to changing conditions, maximizing overall efficiency.
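The AIMD rule described above fits in a few lines. This is a minimal simulation of the rate update, with the increase step and halving factor as the standard defaults:

```python
# Minimal AIMD simulation: the sending rate climbs by one unit per
# acknowledged packet and is halved whenever congestion (a lost or
# delayed acknowledgement) is detected.
def aimd(events, increase=1, decrease=0.5, start=1.0):
    """events: sequence of 'ack' or 'loss'; returns the rate after each event."""
    rate, history = start, []
    for event in events:
        if event == "ack":
            rate += increase          # additive increase
        else:
            rate *= decrease          # multiplicative decrease
        history.append(rate)
    return history

print(aimd(["ack", "ack", "ack", "loss", "ack"]))
# [2.0, 3.0, 4.0, 2.0, 3.0]
```

The sawtooth this produces — steady climb, sharp backoff — is the signature of AIMD traffic.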

Navlakha, who develops algorithms to understand complex biological networks, wondered if the brain, with its billions of distributed neurons, was managing information similarly. So, he and coauthor Jonathan Suen, a postdoctoral scholar at Duke University, set out to mathematically model neural activity.

Because AIMD is one of a number of flow-control algorithms, the duo decided to model six others as well. In addition, they analyzed which model best matched physiological data on neural activity from 20 experimental studies. In their models, AIMD turned out to be the most efficient at keeping the flow of information moving smoothly, adjusting traffic rates whenever paths got too congested. More interestingly, AIMD also turned out to best explain what was happening to neurons experimentally.

It turns out the neuronal equivalent of additive increase is called long-term potentiation. It occurs when one neuron fires closely after another, which strengthens their synaptic connection and makes it slightly more likely the first will trigger the second in the future. The neuronal equivalent of multiplicative decrease occurs when the firing of two neurons is reversed (second before first), which weakens their connection, making the first much less likely to trigger the second in the future. This is called long-term depression. As synapses throughout the network weaken or strengthen according to this rule, the whole system adapts and learns.
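The synaptic analogue has the same additive-increase, multiplicative-decrease shape. A sketch of the weight update, with illustrative parameter values not drawn from the study:

```python
# Sketch of the synaptic analogue of AIMD: long-term potentiation
# (pre-synaptic neuron fires before post-synaptic) adds a small fixed
# increment to the connection weight, while long-term depression
# (post before pre) scales it down multiplicatively.
# Parameter values here are illustrative only.
def update_weight(weight, order, ltp_step=0.05, ltd_factor=0.5):
    if order == "pre_before_post":
        return weight + ltp_step      # LTP: additive strengthening
    return weight * ltd_factor        # LTD: multiplicative weakening

w = 1.0
w = update_weight(w, "pre_before_post")   # slightly strengthened
w = update_weight(w, "post_before_pre")   # sharply weakened
```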

“While the brain and the Internet clearly operate using very different mechanisms, both use simple local rules that give rise to global stability,” says Suen. “I was initially surprised that biological neural networks utilized the same algorithms as their engineered counterparts, but, as we learned, the requirements for efficiency, robustness, and simplicity are common to both living organisms and the networks we have built.”

Understanding how the system works under normal conditions could help neuroscientists better understand what happens when these results are disrupted, for example, in learning disabilities. “Variations of the AIMD algorithm are used in basically every large-scale distributed communication network,” says Navlakha. “Discovering that the brain uses a similar algorithm may not be just a coincidence.”

Germanium tin laser could increase processing speed of computer chips

A multi-institutional team of researchers, led by University of Arkansas engineering professor Shui-Qing “Fisher” Yu and a leading semiconductor equipment manufacturer, has fabricated an “optically pumped” laser made of the alloy germanium tin grown on silicon substrates.

The augmented material could lead to the development of fully integrated silicon photonics, including both circuits and lasers, and thus faster micro-processing speed at much lower cost.

The researchers’ findings were published in Applied Physics Letters.

Germanium tin holds great promise as a semiconducting material for future optical integration of computer chips, because it harnesses efficient emission of light, which silicon, the standard material for making computer chips, cannot do. In recent years, materials scientists and engineers, including Yu and several of his colleagues on this project, have focused on the development of germanium tin, grown on silicon substrates, to build a so-called optoelectronics “superchip” that can transmit data much faster than current chips.

Yu and his colleagues’ most recent contribution to this effort is an optically pumped laser made of germanium tin. “Optically pumped” means the material is energized by injecting light, analogous to an injection of electrical current.

“We reduced the laser threshold 80 percent at a lasing operation temperature up to 110 Kelvin,” Yu said. “This is significant progress compared with the previously reported best result and shows that germanium tin holds great promise as an on-chip laser.”

A temperature of 110 kelvins is equal to about minus 262 degrees Fahrenheit.

On this project, Yu and his colleagues worked with ASM America Inc.’s research and development staff, who developed the growth methods. ASM’s methods produce low-cost and high-quality germanium tin in an industry standard chemical vapor deposition reactor.