genSpace

 


About genSpace

 

geWorkbench (genomics Workbench) is a Java-based open-source platform for integrated genomics. Using a component architecture, it allows individually developed plug-ins to be configured into complex bioinformatic applications. At present there are more than 70 available plug-ins supporting the visualization and analysis of gene expression and sequence data. Example use cases include:

  • loading data from local or remote data sources.
  • visualizing gene expression, molecular interaction networks, protein sequence and protein structure data in a variety of ways.
  • providing access to client- and server-side computational analysis tools such as t-test analysis, hierarchical clustering, self-organizing maps, regulatory network reconstruction, BLAST searches, pattern/motif discovery, etc.
  • validating computational hypotheses through the integration of gene and pathway annotation information from curated sources as well as through Gene Ontology enrichment analysis.

genSpace is a suite of collaboration plug-ins for geWorkbench, intended to support knowledge sharing among computational biologists based on popular social networking motifs. genSpace logs all user activities to a backend server and data mines this information to recommend tools and workflows (sequences of analysis and visualization tools) in “people like you” style. It also supports Facebook-like friends (direct collaborators) and networks (colleagues from the same lab, institution or community), presence facilities including available/away/offline status and a live activity feed, and a shared research notebook that documents the details of all analyses. An introduction to the genSpace web services can be found here.
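To make the recommendation idea concrete, here is a minimal sketch of suggesting a next tool from logged workflows. It is not genSpace's actual algorithm or API: the class `WorkflowRecommender`, its methods, and the sample workflows are all hypothetical, and it simply counts which tool most often directly followed another across every logged workflow, without the per-user similarity that a “people like you” recommender would add.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: suggests a next tool based on logged workflows. */
class WorkflowRecommender {
    // counts.get(a).get(b) = number of times tool b directly followed tool a
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Records one logged workflow, given as an ordered list of tool names. */
    public void logWorkflow(List<String> tools) {
        for (int i = 0; i + 1 < tools.size(); i++) {
            counts.computeIfAbsent(tools.get(i), k -> new HashMap<>())
                  .merge(tools.get(i + 1), 1, Integer::sum);
        }
    }

    /** Returns up to n tools that most often followed the given tool, most frequent first. */
    public List<String> recommendNext(String tool, int n) {
        Map<String, Integer> followers =
                counts.getOrDefault(tool, Collections.emptyMap());
        List<String> ranked = new ArrayList<>(followers.keySet());
        // Sort by descending count; break ties alphabetically so results are deterministic.
        ranked.sort(Comparator.comparingInt((String t) -> -followers.get(t))
                              .thenComparing(Comparator.naturalOrder()));
        return new ArrayList<>(ranked.subList(0, Math.min(n, ranked.size())));
    }

    public static void main(String[] args) {
        WorkflowRecommender recommender = new WorkflowRecommender();
        // Made-up workflows; the tool names are borrowed from the use cases listed above.
        recommender.logWorkflow(Arrays.asList("t-test", "hierarchical clustering"));
        recommender.logWorkflow(Arrays.asList("t-test", "hierarchical clustering", "BLAST"));
        recommender.logWorkflow(Arrays.asList("t-test", "self-organizing maps"));
        System.out.println(recommender.recommendNext("t-test", 2));
        // prints [hierarchical clustering, self-organizing maps]
    }
}
```

A fuller version might restrict the counts to workflows logged by a user's friends or networks, which is one plausible reading of “people like you”, but that is an assumption rather than something stated here.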

This research is in collaboration with the Center for the Multiscale Analysis of Genomic and Cellular Networks (MAGNet) on the Columbia University Health Sciences campus, which is funded by NIH and NCI.

Team Members

Faculty

Prof. Gail Kaiser, kaiser [at] cs.columbia.edu

PhD Students
Fang-Hsiang (Mike) Su, mikefhsu [at] cs.columbia.edu

Former PhD Students and MS GRAs
Jon Bell, jbell [at] cs.columbia.edu
Swapneel Sheth, swapneel [at] cs.columbia.edu
Chris Murphy, cmurphy [at] cs.columbia.edu
Nikhil Sarda, ns2847 [at] columbia.edu

Project Students
John Murphy, jvm2108 [at] columbia.edu
Abhaar Gupta, ag3468 [at] columbia.edu

Former Project Students
Yu Wang
Ami Kumar
Huimin Sun
Diana Chang
Anureet Dhillon
Gowri Kanugovi
Mayur Lodha
Koichiro Matsunaga
Lakshmi Nadig
Joshua Nankin
Cheng Niu
Gaurav Pandey
Hyuksoo Seo
Yuan Wang
Eric Schmidt
Nan Luo
Danielle Cauthen
Flavio Antonelli
Ning Yu
Jason Halpern
Evgeny Fedetov
Aditya Bir
Alison Yang

Links

Papers, Presentations, etc.

C2B2 retreat poster and slides, May 2013
C2B2 retreat poster and slides, May 2012
DEIT 2011 paper and slides – “Towards using Cached Data Mining for Large Scale Recommender Systems”
RSSE 2010 paper and poster – “The weHelp Reference Architecture for Community-Driven Recommender Systems”
C2B2 retreat posters (1 and 2), April 2010
SSE 2010 paper and workshop presentation – “weHelp: A Reference Architecture for Social Recommender Systems”
C2B2 retreat presentation and poster, March 2009
SoSEA 2008 paper and workshop presentation – “genSpace: Exploring Social Networking Metaphors for Knowledge Sharing and Scientific Collaborative Work”
C2B2 retreat presentation and poster, April 2008

Documentation
genSpace wiki
geWorkbench wiki
C2B2 project management wiki

Source Code
geWorkbench repository (login required)

 

Contact: Fang-Hsiang (Mike) Su

For information on openings for our various projects, please see our student ads