Program Analysis Group publications by topic

Our primary research area is programmer productivity, which spans the spectrum from software engineering, to program analysis (both static and dynamic), to programming language design. Here are the group's publications. (Superseded papers are not listed.)

Contents:

Programming language design
Immutability (side effects)
Static analysis
Security
Mining software repositories
Refactoring
Testing
Bug prediction
Invariant detection
Dynamic analysis
Software engineering
Verification
Miscellaneous
Databases

Programming language design

Using predicate fields in a highly flexible industrial control system (Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2005), 2005)
Predicate-oriented programming has not previously been evaluated outside a research context. This paper describes a case study that evaluates the use of predicate fields in a large-scale application that is in daily use.
Practical pluggable types for Java (ISSTA 2008, Proceedings of the 2008 International Symposium on Software Testing and Analysis, 2008)
We have created a framework for pluggable type-checking in Java (including backward-compatible syntax). Checkers built with the framework found real errors in existing software.
Type Annotations specification (JSR 308) (2008)
This specification extends Java's annotation system to permit annotations on any use of a type (including generic type arguments, method receivers, etc.). It is planned for inclusion in Java 7, but it is usable immediately with full backward compatibility with existing compilers and JVMs.
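The two entries above describe pluggable type-checking and the type-annotation syntax it relies on. The following is a minimal, self-contained Java sketch (the @NonNull qualifier is defined locally and is only a stand-in for a checker's annotation); it compiles on Java 8 and later, where type-use annotations were eventually standardized.

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Target;
    import java.util.ArrayList;
    import java.util.List;

    public class TypeAnnotationSketch {

        // Stand-in qualifier, defined locally so the example is self-contained;
        // a real pluggable checker would supply its own annotations.
        @Target(ElementType.TYPE_USE)
        @interface NonNull {}

        // Annotations on a return type and on a generic type argument,
        // positions that JSR 308 makes legal.
        static @NonNull String first(List<@NonNull String> names) {
            return names.get(0);
        }

        public static void main(String[] args) {
            List<@NonNull String> names = new ArrayList<>();
            names.add("JSR 308");
            // Plain javac ignores the qualifiers, so existing toolchains keep
            // working; a pluggable checker would reject null flowing into them.
            System.out.println(first(names));
        }
    }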
Featherweight Ownership and Immutability Generic Java (FOIGJ) (TR, 2009)
This paper presents a formalism and proofs of a type system (OIGJ) that combines ownership with immutability. This report has since been superseded.
Ownership and immutability in generic Java (Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2010), 2010)
This paper shows how to combine ownership and immutability, producing a result that is more expressive and safe than previous attempts. The paper includes both a proof of soundness and an implementation with case studies.
Always-available static and dynamic feedback (ICSE'11, Proceedings of the 33rd International Conference on Software Engineering, 2011)
The DuctileJ language enables the rapid development and flexibility of a dynamically-typed language, and the reliability and comprehensibility of a statically-typed language, via two views of the program during development.
Building and using pluggable type-checkers (ICSE'11, Proceedings of the 33rd International Conference on Software Engineering, 2011)
This paper evaluates the ease of pluggable type-checking with the Checker Framework. The type checkers were easy to write, easy for novices to use, and effective in finding hundreds of errors in case studies of 2 million lines of code.

Immutability (side effects)

A practical type system and language for reference immutability (Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2004), 2004)
This paper presents a type system, language, implementation, and evaluation of a safe mechanism for enforcing reference immutability, where an immutable reference may not be used to cause side effects to any object reachable from it.
Javari: Adding reference immutability to Java (Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2005), 2005)
A compiler can guarantee that an immutable reference is not used to cause side effects to any reachable object. This paper extends previous proposals in many ways, including formal type rules and support for Java generics.
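To make the contrast with Java's built-in final concrete, here is a small runnable illustration (the class choices are arbitrary); the Javari syntax in the comments is a paraphrase, and the papers give the precise rules.

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;

    public class DeepImmutabilitySketch {
        public static void main(String[] args) {
            final List<Date> schedule = new ArrayList<>();
            schedule.add(new Date(0));

            // `final` freezes only the reference; neither mutation below is rejected:
            schedule.add(new Date(1));       // mutates the list itself
            schedule.get(0).setTime(42);     // mutates an object reachable from it

            // With reference immutability (Javari-style, not plain Java), a
            // read-only view rejects such mutations at compile time, e.g.:
            //   readonly List<Date> view = schedule;
            //   view.add(new Date(2));      // compile-time error
            // and the protection extends transitively to state reached through
            // the view (the papers give the precise rules, including how type
            // arguments such as Date are handled).
            System.out.println(schedule.size() + " entries");
        }
    }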
Object and reference immutability using Java generics (ESEC/FSE 2007: Proceedings of the 11th European Software Engineering Conference and the 15th ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2007)
This paper shows how to provide compiler-checked guarantees of both object immutability and reference immutability, without changes to Java syntax, by providing a single “immutability” generic parameter for each class.
Inference of reference immutability (ECOOP 2008 — Object-Oriented Programming, 22nd European Conference, 2008)
This paper presents a precise, scalable, novel algorithm for inference of reference immutability (as defined by the Javari language), and an experimental evaluation on substantial programs.
A formal definition and evaluation of parameter immutability (Automated Software Engineering, 2009)
This paper defines reference immutability formally and precisely. The paper gives new algorithms for inferring immutability that are more scalable and precise than previous approaches. The paper compares our algorithms, previous algorithms, and the formal definition.

Static analysis

A C/C++ Front End for the Daikon Dynamic Invariant Detection System (Masters thesis, 2002)
An instrumenter for data structures in C and C++ programs must determine whether a given pointer is valid; whether it points to a single element or an array; if an array, how large, and how much of it is in use. This thesis solves these and related issues in C/C++ run-time instrumentation.
Selecting Predicates for Conditional Invariant Detection Using Cluster Analysis (Masters thesis, 2002)
This thesis proposes the use of clustering, an artificial intelligence technique, for splitting heterogeneous program analysis data into homogeneous parts. The technique is effective in creating implications of the form “p => q” that are useful in verification and bug detection.
An empirical analysis of C preprocessor use (IEEE Transactions on Software Engineering, 2002)
Undisciplined use of the C preprocessor can greatly hinder program understanding and manipulation. This paper examines 1.2 million lines of C code to determine what types of problematic uses appear in practice.
Selecting, refining, and evaluating predicates for program analysis (TR, 2003)
This paper investigates several techniques, notably dynamic ones based on random selection and machine learning, for predicate abstraction — selecting the predicates that are most effective for a program analysis.
Efficiently refactoring Java applications to use generic libraries (ECOOP 2005 — Object-Oriented Programming, 19th European Conference, 2005)
This paper gives a type-constraint-based method for converting non-generic uses of Java collections into generic ones. In its context-sensitive version it can also generify methods. It is implemented in Eclipse.
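A hypothetical before/after pair gives the flavor of the rewrite (it is not the paper's algorithm): the inferred type argument removes the cast and the unchecked warnings.

    import java.util.ArrayList;
    import java.util.List;

    public class GenerifySketch {

        // Before: a raw type; the cast is needed and unchecked warnings appear.
        static String firstRaw(List names) {
            return (String) names.get(0);
        }

        // After: the inferred instantiation List<String> makes the cast
        // unnecessary and the unchecked warnings disappear.
        static String firstGeneric(List<String> names) {
            return names.get(0);
        }

        public static void main(String[] args) {
            List<String> names = new ArrayList<>();
            names.add("generics");
            System.out.println(firstRaw(names));
            System.out.println(firstGeneric(names));
        }
    }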
Static deadlock detection for Java libraries (ECOOP 2005 — Object-Oriented Programming, 19th European Conference, 2005)
This paper gives a technique for determining whether any invocation of a library can lead to deadlock in the client program; it can also be extended to the closed-world (whole-program) case.
HAMPI: A solver for string constraints (ISSTA 2009, Proceedings of the 2009 International Symposium on Software Testing and Analysis, 2009)
This paper describes an efficient and expressive solver (that is, a decision procedure) for string constraints such as language membership. Such a solver is useful when analyzing programs that manipulate strings.
Inference of field initialization (TR, 2010)
This paper presents a flow-sensitive, exception-aware static analysis that infers what program variables might hold not-yet-fully-initialized objects. Such an object might violate its object invariants, until its initialization is complete.
Featherweight Ownership and Immutability Generic Java (FOIGJ) — technical report (TR, 2010)
This technical report contains proofs that were omitted from our paper entitled “Ownership and Immutability in Generic Java”, which appears in OOPSLA 2010.
Inference of field initialization (ICSE'11, Proceedings of the 33rd International Conference on Software Engineering, 2011)
This paper presents a flow-sensitive, exception-aware static analysis that infers what program variables might hold not-yet-fully-initialized objects. Such an object might violate its object invariants, until its initialization is complete.
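The sketch below (class and field names invented) shows the hazard the analysis targets: a method called from the constructor observes a field that is still null, so the object invariant does not yet hold.

    public class InitializationSketch {

        static class Account {
            private final String owner;    // invariant: owner is non-null

            Account(String owner) {
                printSummary();            // runs while `owner` is still null
                this.owner = owner;
            }

            void printSummary() {
                if (owner == null) {
                    System.out.println("owner not yet initialized");
                } else {
                    System.out.println("owner = " + owner);
                }
            }
        }

        public static void main(String[] args) {
            Account account = new Account("Alice");
            account.printSummary();        // now prints the real value
        }
    }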

Security

A Machine-Checked Safety Proof for a CISC-Compatible SFI Technique (TR, 2006)
This report summarizes a machine-checked proof in ACL2 of the safety of a technique to enforce memory and control-flow safety for x86 binaries by machine-code rewriting.
Evaluating SFI for a CISC Architecture (15th USENIX Security Symposium, 2006)
This paper describes an instruction-level rewriting technique to enforce memory and control-flow safety. A prototype implementation demonstrates the technique's scalability, and a machine-checked proof formalizes its security guarantee.
A simulation-based proof technique for dynamic information flow (PLAS 2007: ACM SIGPLAN Workshop on Programming Languages and Analysis for Security, 2007)
A dynamic analysis of information flow (e.g., end-to-end confidentiality) is sound if its estimates of the amount of information revealed can never be too low. This paper proves such a soundness result by simulating an analyzed program with a pair of copies connected by a limited-bandwidth channel.
Quantitative Information-Flow Tracking for Real Systems (Ph.D. dissertation, 2008)
This thesis describes all our work on quantitative information-flow analysis through mid-2008: the formalization from PLAS 2007, the graph-based algorithms and case studies from PLDI 2008, more details on tainting-based analysis and efficient graph algorithms from earlier TRs, and some improved presentation and discussion.
Quantitative information flow as network flow capacity (PLDI 2008, Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, 2008)
To obtain a more precise measurement of the amount of secret information a program might reveal, we replace the usual technique of tainting (reachability from secret inputs) with a maximum-flow computation on a graph representation of execution with edge capacities. With appropriate optimizations, the technique scales to check realistic properties in several large C programs.
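The max-flow step can be illustrated on a toy graph (the graph and capacities here are invented; the paper's graphs come from instrumented executions): the secret reaches the output only through a 1-bit comparison, so at most 1 bit can be revealed.

    import java.util.ArrayDeque;
    import java.util.Arrays;
    import java.util.Deque;

    public class InfoFlowSketch {
        // Node 0: 32-bit secret input; node 1: result of a comparison on the
        // secret; node 2: public output. Edge capacities are widths in bits.
        static int[][] capacity = {
            {0, 1, 0},    // secret -> comparison result: 1 bit
            {0, 0, 32},   // comparison result -> output
            {0, 0, 0}
        };

        public static void main(String[] args) {
            // Max flow from the secret to the output bounds the bits revealed.
            System.out.println("Upper bound on leaked bits: " + maxFlow(0, 2));
        }

        // Textbook Edmonds-Karp max flow over the small capacity matrix above.
        static int maxFlow(int source, int sink) {
            int n = capacity.length;
            int[][] residual = new int[n][];
            for (int i = 0; i < n; i++) residual[i] = capacity[i].clone();
            int flow = 0;
            while (true) {
                int[] parent = new int[n];
                Arrays.fill(parent, -1);
                parent[source] = source;
                Deque<Integer> queue = new ArrayDeque<>();
                queue.add(source);
                while (!queue.isEmpty() && parent[sink] == -1) {
                    int u = queue.remove();
                    for (int v = 0; v < n; v++) {
                        if (parent[v] == -1 && residual[u][v] > 0) {
                            parent[v] = u;
                            queue.add(v);
                        }
                    }
                }
                if (parent[sink] == -1) return flow;   // no augmenting path left
                int bottleneck = Integer.MAX_VALUE;
                for (int v = sink; v != source; v = parent[v]) {
                    bottleneck = Math.min(bottleneck, residual[parent[v]][v]);
                }
                for (int v = sink; v != source; v = parent[v]) {
                    residual[parent[v]][v] -= bottleneck;
                    residual[v][parent[v]] += bottleneck;
                }
                flow += bottleneck;
            }
        }
    }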
Automatic creation of SQL injection and cross-site scripting attacks (ICSE'09, Proceedings of the 31st International Conference on Software Engineering, 2009)
By generating inputs that cause the program to execute particular statements, then modifying those inputs into attack vectors, it is possible to prove the presence of real, exploitable security vulnerabilities in PHP programs.
Automatically patching errors in deployed software (Proceedings of the 21st ACM Symposium on Operating Systems Principles, 2009)
The ClearView system automatically: classifies executions as normal or attack; learns a model of the normal executions; generates patches that correct deviations from the model during an attack; evaluates the patches; and distributes the best one. A hostile Red Team evaluation shows that ClearView is effective.

Mining software repositories

Predicting faults from cached history (ICSE'07, Proceedings of the 29th International Conference on Software Engineering, 2007)
Faults often co-occur, so when one is fixed, another one may be lurking nearby. Using this intuition, this paper aims to identify a small number of files that contain most of the faults in a program.
Classifying Software Changes: Clean or Buggy? (IEEE Transactions on Software Engineering, 2008)
This paper predicts whether a given software change is likely to contain latent bugs. It uses machine learning to determine whether the change is more similar to previous buggy changes or previous non-buggy changes.
Extracting Structural Information from Bug Reports (MSR 2008: 5th Working Conference on Mining Software Repositories, 2008)
This paper presents the infoZilla tool, which parses unstructured user bug reports. This will facilitate future research that analyzes bug reports.

Refactoring

Automated support for program refactoring using invariants (ICSM 2001, Proceedings of the International Conference on Software Maintenance, 2001)
This paper shows how program invariants can productively be used to identify candidate refactorings, which are small-scale program transformations that improve program structure, performance, or other features.
Converting Java programs to use generic libraries (Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2004), 2004)
When type parameters are added to library code, client code should be upgraded to supply parameters at each use of library classes. This paper presents a sound and precise combined pointer and type-based analysis that does so.
Refactoring for parameterizing Java classes (ICSE'07, Proceedings of the 29th International Conference on Software Engineering, 2007)
Programmers must change their programs to take advantage of generic types. This paper presents a type-constraint-based analysis that converts non-generic classes such as List into generic classes such as List<T>.
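A hypothetical before/after pair (not taken from the paper) shows the effect of the refactoring on a tiny container class and its clients.

    public class ParameterizeSketch {

        // Before: a non-generic container written against Object.
        static class BoxRaw {
            private Object value;
            void set(Object value) { this.value = value; }
            Object get() { return value; }
        }

        // After: the refactoring introduces a type parameter T and replaces the
        // occurrences of Object that the type-constraint analysis ties to it.
        static class Box<T> {
            private T value;
            void set(T value) { this.value = value; }
            T get() { return value; }
        }

        public static void main(String[] args) {
            BoxRaw raw = new BoxRaw();
            raw.set("hello");
            String s1 = (String) raw.get();   // client must cast

            Box<String> box = new Box<>();
            box.set("hello");
            String s2 = box.get();            // no cast needed
            System.out.println(s1 + " " + s2);
        }
    }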
How do programs become more concurrent? A story of program transformations (TR, 2008)
This paper is a historical analysis of the transformations that programmers used to convert sequential programs into concurrent versions.
Refactoring sequential Java code for concurrency via concurrent libraries (ICSE'09, Proceedings of the 31st International Conference on Software Engineering, 2009)
This paper presents a refactoring tool, Concurrencer, that transforms sequential code into parallel code that uses the java.util.concurrent libraries.
Refactoring using type constraints (TR, 2009)
A program is one solution to the type constraints that are imposed by its interfaces and use of libraries. Other solutions may be possible, and they represent legal refactorings of the program.

Testing

Improving test suites via operational abstraction (ICSE'03, Proceedings of the 25th International Conference on Software Engineering, 2003)
This paper proposes a technique for selecting test cases that is similar to structural code coverage techniques, but operates in the semantic domain of program behavior rather than in the lexical domain of program text. The technique outperforms branch coverage in test suite size and in fault detection.
Reducing wasted development time via continuous testing (Fourteenth International Symposium on Software Reliability Engineering, 2003)
Early notification of problems enables cheaper fixes. This paper evaluates how much developer time could be saved by continuous testing, which uses extra CPU cycles to continuously run tests in the background.
Continuous testing in Eclipse (2nd Eclipse Technology Exchange Workshop (eTX), 2004)
This paper describes the design principles, user interface, architecture, and implementation of a publicly-available continuous testing plug-in for the Eclipse development environment.
Automatic mock object creation for test factoring (ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE'04), 2004)
A set of small, fast-running tests can be more useful than a single larger test. This paper proposes a way to automatically factor a large test case into smaller tests, using mock objects to model the environment.
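The paper automates what the hand-written sketch below does manually (all names are invented): a slow environment dependency is replaced by a mock that replays recorded responses, so the factored test runs quickly and in isolation.

    public class MockObjectSketch {

        interface PriceService {                 // the slow external dependency
            int priceOf(String item);
        }

        static class Cart {                      // the code under test
            private final PriceService prices;
            Cart(PriceService prices) { this.prices = prices; }
            int total(String... items) {
                int sum = 0;
                for (String item : items) sum += prices.priceOf(item);
                return sum;
            }
        }

        // Mock that replays recorded responses instead of contacting the real service.
        static class RecordedPriceService implements PriceService {
            public int priceOf(String item) {
                return item.equals("apple") ? 3 : 5;
            }
        }

        public static void main(String[] args) {
            Cart cart = new Cart(new RecordedPriceService());
            int total = cart.total("apple", "pear");
            System.out.println(total == 8 ? "factored test passed" : "failed");
        }
    }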
An experimental evaluation of continuous testing during development (ISSTA 2004, Proceedings of the 2004 International Symposium on Software Testing and Analysis, 2004)
Continuous testing during development provides early notification of errors. This paper reports a controlled experiment to measure whether and how programmers benefit from continuous testing.
Eclat: Automatic generation and classification of test inputs (ECOOP 2005 — Object-Oriented Programming, 19th European Conference, 2005)
This paper presents an automatic mechanism for selecting tests that are likely to expose errors — tests whose run-time behavior is maximally different from succeeding runs. The paper also gives techniques for test input generation and for converting a test input into a test case.
Automatic test factoring for Java (ASE 2005: Proceedings of the 20th Annual International Conference on Automated Software Engineering, 2005)
Test factoring creates fast, focused unit tests from slow system-wide tests. Each new unit test exercises only a subset of the functionality exercised by the system test.
An empirical comparison of automated generation and classification techniques for object-oriented unit testing (ASE 2006: Proceedings of the 21st Annual International Conference on Automated Software Engineering, 2006)
This paper experimentally evaluates four test generation strategies: the four combinations of model-based vs. exception-based, and symbolic vs. random. The model-based symbolic combination is new.
Finding the needles in the haystack: Generating legal test inputs for object-oriented programs (M-TOOS 2006: 1st Workshop on Model-Based Testing and Object-Oriented Systems, 2006)
An automatically inferred model of legal method call sequences can bias test generation toward legal method sequences, increasing coverage and creating data structures beyond the ability of undirected random generation.
Feedback-directed random test generation (ICSE'07, Proceedings of the 29th International Conference on Software Engineering, 2007)
This paper presents a technique that improves random test generation by incorporating feedback from executing previously-generated inputs. The technique found serious, previously-unknown errors in widely-deployed applications.
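The generation loop can be caricatured in a few lines (a toy over one invented class; the paper's tool handles arbitrary APIs, exceptions, and redundancy pruning): sequences whose execution satisfies the contract are kept and extended, and a violating sequence is reported as an error-revealing test.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    public class FeedbackDirectedSketch {

        // Invented subject class with a seeded bug: reset() breaks the contract
        // "value() is never negative" once the counter exceeds 5.
        static class Counter {
            private int value = 0;
            void increment() { value++; }
            void reset() { value = (value > 5) ? -1 : 0; }
            int value() { return value; }
        }

        public static void main(String[] args) {
            Random random = new Random(0);
            List<List<String>> pool = new ArrayList<>();
            pool.add(new ArrayList<>());   // start from the empty call sequence

            for (int i = 0; i < 10000; i++) {
                // Extend a previously generated sequence with one random call.
                List<String> extended = new ArrayList<>(pool.get(random.nextInt(pool.size())));
                extended.add(random.nextInt(4) < 3 ? "increment" : "reset");

                // Execute it; the outcome is the "feedback".
                Counter counter = new Counter();
                for (String call : extended) {
                    if (call.equals("increment")) counter.increment(); else counter.reset();
                }

                if (counter.value() < 0) {
                    // Contract violated: report the sequence as an error-revealing test.
                    System.out.println("Violation: " + extended);
                    return;
                }
                pool.add(extended);   // contract held: keep it for further extension
            }
            System.out.println("No violation found.");
        }
    }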
Test factoring with amock: Generating readable unit tests from system tests (Masters thesis, 2007)
Test factoring converts a long system test into many short unit tests. This thesis extends previous work by creating unit tests that are readable, modifiable, and maintainable by programmers.
From developer's head to developer tests: Characterization, theories, and preventing one more bug (Companion to Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2007), 2007)
This poster discusses two ways to aid testers: generalized assertions called theories that must hold for any data structure created by a program, and characterization tools that infer such generalized assertions.
Theory-infected: Or how I learned to stop worrying and love universal quantification (Companion to Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2007), 2007)
A theory is a generalized unit test — a predicate that should be true for any object or combination of objects that may exist. This demo presents theories and tools that support them to improve testing.
Theories in practice: Easy-to-write specifications that catch bugs (TR, 2008)
A “theory” is a claim about one or more objects — an executable predicate that generalizes over a (possibly infinite) set of tests. Theories serve as a type of specification that can complement or supersede individual tests.
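As a concrete example (not taken from the papers), a theory over lists might state that reversing twice is the identity; the sketch below checks it over a handful of sampled inputs, whereas the tools supply much richer input generation.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class TheorySketch {

        // Theory: reversing a list twice yields the original list.
        static boolean reverseTwiceIsIdentity(List<Integer> list) {
            List<Integer> copy = new ArrayList<>(list);
            Collections.reverse(copy);
            Collections.reverse(copy);
            return copy.equals(list);
        }

        public static void main(String[] args) {
            List<List<Integer>> samples = new ArrayList<>();
            samples.add(List.of());
            samples.add(List.of(7));
            samples.add(List.of(1, 2, 3));
            samples.add(List.of(5, 5, 5));

            for (List<Integer> sample : samples) {
                if (!reverseTwiceIsIdentity(sample)) {
                    System.out.println("Theory falsified by " + sample);
                    return;
                }
            }
            System.out.println("Theory held for all sampled inputs.");
        }
    }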
Grammar-based whitebox fuzzing (PLDI 2008, Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, 2008)
A test generation strategy should produce inputs that satisfy the structural requirements of the program under test. This paper shows how to generate inputs conforming to a grammar while satisfying path constraints.
ReCrash: Making software failures reproducible by preserving object states (ECOOP 2008 — Object-Oriented Programming, 22nd European Conference, 2008)
ReCrash is a lightweight technique that monitors a program for failures such as crashes. When a failure occurs in the field, ReCrash creates multiple unit tests that reproduce it. This eases debugging and fixing the errors.
Finding errors in .NET with feedback-directed random testing (ISSTA 2008, Proceedings of the 2008 International Symposium on Software Testing and Analysis, 2008)
Test engineers at Microsoft used feedback-directed random testing on a critical component of the .NET architecture. The feedback-directed random testing tool found new errors at low developer cost.
Directed Random Testing (Ph.D. dissertation, 2009)
Directed random testing combines random testing with model-based testing, resulting in better coverage and fewer illegal inputs. Experiments confirm its utility, finding bugs in the Java JDK, .NET framework, etc.
Finding bugs in web applications using dynamic test generation and explicit state model checking (IEEE Transactions on Software Engineering, 2010)
This paper extends dynamic test generation, based on combined concrete and symbolic execution, to the new domain of web applications, and finds 673 bugs in 6 PHP web applications.

Bug prediction

Predicting problems caused by component upgrades (ESEC/FSE 2003: Proceedings of the 9th European Software Engineering Conference and the 11th ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2003)
A software upgrade may break a customer's system because of differences between it and the vendor's test environment. This paper shows how to predict such problems without having to integrate and test.
Finding latent code errors via machine learning over program executions (ICSE'04, Proceedings of the 26th International Conference on Software Engineering, 2004)
This paper shows the efficacy of a technique that performs machine learning over correct and incorrect programs, then uses the resulting models to identify latent errors in other programs.
Early identification of incompatibilities in multi-component upgrades (ECOOP 2004 — Object-Oriented Programming, 18th European Conference, 2004)
This paper extends the technique of “Predicting problems caused by component upgrades” [ESEC/FSE 2003] to deal with multi-module upgrades and arbitrary calling patterns. It also presents 4 other enhancements and a case study of upgrading the C standard library as used by 48 Unix programs.
Formalizing lightweight verification of software component composition (SAVCBS 2004: Specification and Verification of Component-Based Systems, 2004)
This paper formalizes (and corrects a few mistakes in) previous work on predicting problems caused by component upgrades. It presents an outline for a proof that the technique permits only sound upgrades.
Prioritizing warnings by analyzing software history (MSR 2007: International Workshop on Mining Software Repositories, 2007)
Bug-finding tools prioritize their warning messages, but often do so poorly. This paper gives a new prioritization, based on which problems programmers have fixed most quickly in the past — those are likely to be the most important ones.
Which warnings should I fix first? (ESEC/FSE 2007: Proceedings of the 11th European Software Engineering Conference and the 15th ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2007)
This paper determines how a bug-finding tool should prioritize its warnings. The prioritization is computed from the bug fix history of the program — the problems programmers considered important enough to fix.
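A toy version of the ranking idea behind these two papers (the history and category names are invented): categories whose past warnings were fixed most often come first.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class WarningPrioritySketch {
        public static void main(String[] args) {
            // Invented history: each entry is a warning category and whether a
            // past warning of that category was eventually fixed.
            String[] categories = {"NULL_DEREF", "NULL_DEREF", "NULL_DEREF",
                                   "UNUSED_IMPORT", "UNUSED_IMPORT", "RESOURCE_LEAK"};
            boolean[] fixed = {true, true, false, false, false, true};

            Map<String, int[]> counts = new HashMap<>();  // category -> {fixed, total}
            for (int i = 0; i < categories.length; i++) {
                int[] c = counts.computeIfAbsent(categories[i], k -> new int[2]);
                if (fixed[i]) c[0]++;
                c[1]++;
            }

            // Rank categories by historical fix rate, highest first.
            List<String> ranked = new ArrayList<>(counts.keySet());
            ranked.sort((a, b) -> Double.compare(
                (double) counts.get(b)[0] / counts.get(b)[1],
                (double) counts.get(a)[0] / counts.get(a)[1]));
            System.out.println("Warn in this order: " + ranked);
        }
    }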

Invariant detection

Dynamically discovering pointer-based program invariants (TR, 1999)
This paper extends dynamic invariant detection to pointer-directed data structures by linearizing those structures and, more significantly, by permitting inference of disjunctive invariants (like “A or B”).
Quickly detecting relevant program invariants (ICSE 2000, Proceedings of the 22nd International Conference on Software Engineering, 2000)
Dynamic invariant detection can output many unhelpful properties as well as those of interest. This paper gives four techniques that eliminate uninteresting invariants or add relevant ones, thus increasing the technique's usefulness to programmers.
Dynamically Discovering Likely Program Invariants (Ph.D. dissertation, 2000)
This dissertation overviews dynamic invariant detection as of summer 2000. It includes materials from the papers on the basic invariant detection techniques (IEEE TSE, 2001), relevance improvements (ICSE 2000), and extensions to pointer-based data structures (UW TR, 1999).
Dynamically discovering likely program invariants to support program evolution (IEEE Transactions on Software Engineering, 2001)
Program properties (such as formal specifications or assert statements) are useful for a variety of programming tasks. This paper shows how to dynamically infer program properties by looking for patterns and relationships among values computed at run time.
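The core idea can be shown in miniature (a toy, not Daikon itself; real tools check a large grammar of candidates over many variables): candidate properties are tested against the values observed at run time, and the survivors are reported as likely invariants.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.IntPredicate;

    public class InvariantSketch {
        public static void main(String[] args) {
            // Hypothetical trace of values observed for a variable "x" at one
            // program point across several test executions.
            int[] observed = {2, 4, 8, 16, 4, 32};

            List<String> names = new ArrayList<>();
            List<IntPredicate> candidates = new ArrayList<>();
            names.add("x > 0");      candidates.add(x -> x > 0);
            names.add("x >= 10");    candidates.add(x -> x >= 10);
            names.add("x is even");  candidates.add(x -> x % 2 == 0);
            names.add("x == 4");     candidates.add(x -> x == 4);

            for (int i = 0; i < names.size(); i++) {
                boolean holds = true;
                for (int x : observed) {
                    if (!candidates.get(i).test(x)) { holds = false; break; }
                }
                if (holds) System.out.println("Likely invariant: " + names.get(i));
            }
            // Reports "x > 0" and "x is even": likely invariants, justified
            // only by the observed executions.
        }
    }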
Determining legal method call sequences in object interfaces (2003)
Correct functioning of a component can depend on the component's methods being called in the correct order. This paper simplifies and improves on previous techniques for determining the legal call sequences.
Efficient incremental algorithms for dynamic detection of likely invariants (Proceedings of the ACM SIGSOFT 12th Symposium on the Foundations of Software Engineering (FSE 2004), 2004)
This paper shows how to perform dynamic invariant detection incrementally and efficiently, without being restricted to a trivial grammar; previous implementations suffered from one or more of these problems.
Performance Enhancements for a Dynamic Invariant Detector (Masters thesis, 2007)
As an optimization, the Daikon invariant detector suppresses (does not instantiate or check) potential invariants that are implied by instantiated/checked ones. This thesis investigates optimizations to this suppression mechanism.
The Daikon system for dynamic detection of likely invariants (Science of Computer Programming, 2007)
This paper discusses the Daikon tool, including its features, applications, architecture, and development process. It is not a paper about dynamic invariant detection per se.

Dynamic analysis

Static and dynamic analysis: Synergy and duality (WODA 2003: ICSE Workshop on Dynamic Analysis, 2003)
This position paper is intended to provoke thought and discussion regarding the relationship between static and dynamic analysis, which it claims are not as different as many have assumed.
WODA 2003: ICSE Workshop on Dynamic Analysis (2003)
This workshop brought together practitioners and academics to discuss topics from the structure of the dynamic analysis field, to how to enable better and easier progress, to specific new analysis ideas.
Improving adaptability via program steering (ISSTA 2004, Proceedings of the 2004 International Symposium on Software Testing and Analysis, 2004)
Program steering selects modalities for a program that may operate in several modes. This paper's experiments show that program steering can substantially improve performance in unanticipated situations.
Automatically generating refactorings to support API evolution (ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE 2005), 2005)
This paper proposes an extremely simple source code analysis that performs as well as more complicated analyses in permitting client code to be refactored in response to library changes.
Learning from executions: Dynamic analysis for software engineering and program understanding (2005)
A broad overview of the field of dynamic analysis, including applications of dynamic analysis, dynamic analysis algorithms, and implementation details. The slides capture only part of the tutorial, and naturally the tutorial only captures part of this exciting field.
A Scalable Mixed-Level Approach to Dynamic Analysis of C and C++ Programs (Masters thesis, 2006)
This thesis introduces a mixed-level approach and toolkit for constructing dynamic analyses that combines the benefits of source-level and binary-level approaches. It also presents two dynamic analysis tools — for value profiling and for inferring abstract types.
Detection of web service substitutability and composability (WS-MaTe 2006: International Workshop on Web Services — Modeling and Testing, 2006)
Programmers, end users, and even programs would like to discover and reuse software services. This paper presents an approach to indicate when Web Services may be substituted for one another or composed.
Dynamic inference of abstract types (ISSTA 2006, Proceedings of the 2006 International Symposium on Software Testing and Analysis, 2006)
This paper presents a dynamic analysis for computing abstract types, which indicate which values and variables may interact at run time. The paper also presents implementations and experiments for C++ and Java.
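In miniature (variable names and interactions are invented), the analysis can roughly be seen as union-find over run-time interactions: variables whose values meet in an operation share an abstract type, and int variables that never mix keep distinct abstract types.

    import java.util.HashMap;
    import java.util.Map;

    public class AbstractTypeSketch {
        static Map<String, String> parent = new HashMap<>();

        // Union-find with path compression over variable names.
        static String find(String v) {
            parent.putIfAbsent(v, v);
            String p = parent.get(v);
            if (p.equals(v)) return v;
            String root = find(p);
            parent.put(v, root);
            return root;
        }

        static void interact(String a, String b) {  // an interaction observed at run time
            parent.put(find(a), find(b));
        }

        public static void main(String[] args) {
            // Hypothetical observations from one execution:
            interact("widthMeters", "heightMeters");   // width + height
            interact("elapsedSec", "timeoutSec");      // elapsed < timeout
            // widthMeters never interacts with elapsedSec, so their abstract
            // types stay separate even though both are declared int.
            System.out.println(find("widthMeters").equals(find("heightMeters")));  // true
            System.out.println(find("widthMeters").equals(find("elapsedSec")));    // false
        }
    }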
Inference and enforcement of data structure consistency specifications (ISSTA 2006, Proceedings of the 2006 International Symposium on Software Testing and Analysis, 2006)
This paper presents an automatic technique for recovering from data structure corruption errors. It involves inferring correctness constraints, then repairing any errors that occur at run time. The evaluation includes a hostile “Red Team” evaluation.
An Improved Scalable Mixed-Level Approach to Dynamic Analysis of C and C++ Programs (Masters thesis, 2010)
This thesis describes enhancements to the mixed-level approach to dynamic binary analysis that was first proposed and implemented by Guo.
How tests and proofs impede one another: The need for always-on static and dynamic feedback (4th International Conference on Tests And Proofs (TAP), 2010)
This talk proposes a way to obtain the rapid development of a dynamically-typed language, and the reliability and comprehensibility of a statically-typed language, via two views of the program during development.
Synoptic: Summarizing system logs with refinement (Workshop on Managing Systems via Log Analysis and Machine Learning Techniques (SLAML '10), 2010)
Synoptic produces a small graph that summarizes a distributed system log. Synoptic starts from the smallest possible graph and iteratively expands, and it mines then enforces invariants from the original logs.

Software engineering

Panel: Perspectives on software engineering (ICSE 2001, Proceedings of the 23rd International Conference on Software Engineering, 2001)
This talk argues that tools are key to having impact on software engineering; they must be published; case studies are more fruitful than experiments; researchers have the responsibility to target the state of the art; static and dynamic tools should be combined; and lightweight formal tools are the right place to start.
Refactoring for generalization using type constraints (Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2003), 2003)
This paper describes the theoretical underpinnings behind two refactorings: Extract Interface and Pull Up Members. The technical mechanism is to use type constraints to determine which such changes are permitted.
ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE 2005) (2005)
PASTE 2005 brought together the program analysis, software tools, and software engineering communities to focus on applications of program analysis techniques in software tools.
The Groupthink specification exercise (Software Engineering Education in the Modern Age: Challenges and Possibilities, 2006)
The Groupthink specification exercise is a fun group activity that teaches students about specifications, teamwork, and communication. This paper describes both the goals of the activity (along with an assessment of it), and the mechanics of how to run it.
Speculative identification of merge conflicts and non-conflicts (TR, 2010)
Are software engineers sometimes unaware that their work conflicts with that of their teammates? Are they sometimes unaware that the work has no conflicts? We investigate how often a tool could provide this information, which could help to avoid deviations or wasted work.
Rethinking the economics of software engineering (Workshop on the Future of Software Engineering Research, 2010)
We propose to build tools that permit less-skilled workers to perform certain tasks currently done by developers. This frees developers to do tasks that only they can do, and overall makes software less expensive.
Speculative analysis: Exploring future development states of software (Workshop on the Future of Software Engineering Research, 2010)
This paper proposes that tools should analyze program versions that do not exist yet but that a developer might create. This may help the developer decide in advance whether or not to create that particular version.

Verification

Static verification of dynamically detected program invariants: Integrating Daikon and ESC/Java (Proceedings of RV'01, First Workshop on Runtime Verification, 2001)
Dynamic invariant detection proposes likely (not certain) invariants; static verification requires explicit identification of a goal and annotation of the program. This paper combines the two techniques to overcome the weaknesses of each.
Improved Simulation of Input/Output Automata (Masters thesis, 2001)
The IOA language is used for modeling algorithms for distributed systems in the Input-Output Automaton style. This thesis presents extensions to the IOA Simulator, which is an interpreter for the IOA language. The extensions make the Simulator more useful for testing and verification.
Verifying distributed algorithms via dynamic analysis and theorem proving (TR, 2002)
Automatic theorem-provers require human direction in the form of lemmas and invariants. We investigate a dynamic technique for providing these intermediate steps. In three case studies, they reduced or eliminated human effort.
Automatic generation of program specifications (ISSTA 2002, Proceedings of the 2002 International Symposium on Software Testing and Analysis, 2002)
Sound program verifiers generally require program specifications, which are tedious and difficult to generate. A dynamic analysis can automatically produce unsound specifications. Combining the two techniques overcomes both weaknesses and demonstrates that the dynamic step, while unsound, can be quite accurate in practice.
Invariant inference for static checking: An empirical evaluation (Proceedings of the ACM SIGSOFT 10th International Symposium on the Foundations of Software Engineering (FSE 2002), 2002)
Tool unsoundness and incompleteness may hinder users performing a task, or users may benefit even from imperfect output. In a study of program verification aided by dynamic invariant detection, even very poor output from the invariant detector aided users to a statistically significant degree.
Theorem-proving distributed algorithms with dynamic analysis (Masters thesis, 2003)
This thesis proposes a new methodology, based on dynamic analysis, for making theorem-provers easier to use. The dynamic analysis proposes both lemmas that are necessary for a proof, and proof tactics that can be used to prove the lemmas. Case studies with three distributed algorithms and two theorem-provers show the technique to be effective.
Using simulated execution in verifying distributed algorithms (Software Tools for Technology Transfer, 2004)
This paper proposes integration of dynamic analysis with traditional theorem-proving in ways that extend beyond mere testing. Generalizing over the test runs can reveal necessary lemmas, and the structure of the proof can mirror the structure of the execution.
An overview of JML tools and applications (Software Tools for Technology Transfer, 2005)
This paper overviews the Java Modeling Language (JML) notation for detailed design and gives a brief description of some of the tools that take it as an input or produce it as an output.
Verification for legacy programs (VSTTE: Verified Software: Theories, Tools, Experiments, 2005)
Legacy (under-documented and -specified) code will be with us forever. This paper suggests ways to cope with such systems, both to ameliorate short-term problems and to advance toward a future in which all code is verified.

Miscellaneous

Code Versioning in a Workflow Management System (Masters thesis, 2002)
Workflow systems permit high-level definition of code for managing some physical process. This thesis proposes and implements a workflow system that permits process definitions and implementing code to be changed, and multiple versions of the process and code to be active simultaneously.
Footloose: A Case for Physical Eventual Consistency and Selective Conflict Resolution (5th IEEE Workshop on Mobile Computing Systems and Applications, 2003)
Footloose is a user-centered distributed data store that optimistically shares data and reconciles conflicts across diverse, occasionally-connected devices.

Databases

HaLoop: Efficient Iterative Data Processing on Large Clusters (36th International Conference on Very Large Data Bases, 2010)
An iterative job may repeat similar work on each iteration. This paper shows how to avoid repeating work in iterative MapReduce jobs, substantially improving performance.

(This webpage was created with bibtex2web.)
