The Year of the Thesis: 2008

Thursday, April 3, 2008

Tools

We study the evolution of software systems with the help of tools. By doing this we obtain a better understanding of the domain, and we reuse that knowledge to improve the tools themselves. With better tools, we can learn new facts about the systems we study.

Eventually the users also benefit from the tools, which are now much more powerful.

Sunday, March 23, 2008

Top-Down Software Exploration

Another top-down approach is the Software Reflexion Model by Murphy, Notkin,
and Sullivan (1995). The Software Reflexion Model aims to capture and exploit the
differences that exist between the source code organization and the designer’s
mental model of the high-level system organization. An engineer defines a high-
level model of the structure of the system and specifies how the model maps to
the source. A tool then computes a software reflexion model that shows where the
engineer’s high-level model agrees with and where it differs from a model of the
source. The primary purpose of this technique is to streamline the amount of time
it takes for someone unfamiliar with the system to understand its source code
structure. (from koschke-thesis)
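
A minimal sketch of the computation described above, for my own reference. This is not the actual tool of Murphy et al.; the function name, the data layout and the example file names are assumptions made up for illustration. Only the three result categories (convergence, divergence, absence) come from the reflexion model literature.

```python
def reflexion_model(source_deps, mapping, model_deps):
    """Compare an engineer's high-level model with dependencies lifted from source.

    source_deps: set of (from_entity, to_entity) pairs extracted from the code
    mapping:     dict from source entity to high-level model node
    model_deps:  set of (from_node, to_node) pairs the engineer expects
    """
    lifted = set()
    for src, dst in source_deps:
        a, b = mapping.get(src), mapping.get(dst)
        # Skip unmapped entities and dependencies internal to one model node.
        if a is None or b is None or a == b:
            continue
        lifted.add((a, b))
    return {
        "convergences": model_deps & lifted,  # expected and found in the source
        "divergences": lifted - model_deps,   # found in the source, not expected
        "absences": model_deps - lifted,      # expected, not found in the source
    }


# Hypothetical example system.
source_deps = {
    ("ui/main.c", "core/engine.c"),
    ("core/engine.c", "db/store.c"),
    ("ui/main.c", "db/store.c"),
}
mapping = {"ui/main.c": "UI", "core/engine.c": "Core", "db/store.c": "Storage"}
model_deps = {("UI", "Core"), ("Core", "Storage"), ("Storage", "Core")}

result = reflexion_model(source_deps, mapping, model_deps)
for kind, edges in result.items():
    print(kind, sorted(edges))
```

In this toy example the UI talking directly to Storage shows up as a divergence, and the expected Storage-to-Core dependency shows up as an absence.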

__

One reason for top-down exploration is supporting supervision. The user needs to
see "his" system evolve.

Friday, March 21, 2008

Related Work on Module Dependencies

http://www.eclipsezone.com/articles/lattix-dsm/ - matrix representation of dependencies. Implemented and integrated in Eclipse. By Lattix Inc.

Tuesday, March 18, 2008

inherit and reuse...

From OAuth documentation: "While we wanted the best protocol we could design, we also wanted one that people would use and that would be compatible with existing authentication methods, inherit from existing RFCs and reuse web standards wherever possible. In this way, we ended up leaving a lot out of the spec that seemed interesting but ultimately didn’t belong."

Monday, March 17, 2008

[micro-survey] metamodels of hierarchical exploration tools

what are the metamodels that people use when doing hierarchical graph analysis of software systems?

shrimp - the metamodel is a generalized nested graph. in "manipulating and documenting software structures" (stor-manip) storey discusses the introduction of the nested graphs formalism by d. harel.
(harel-visform)

Nested graphs, in addition to nodes and
arcs, contain composite nodes which are used for denoting
set inclusion. The containment or nesting feature
of composite nodes implicitly communicates the
parent-child relationships in a hierarchy. In SHriMP a
non-leaf node is open when its children are visible and
closed when its children are hidden from view.
[...]
Composite nodes correspond to subsystems in the software.
Composite arcs represent a collection of dependencies.
Composite nodes may contain other composite nodes
and arcs as well as atomic nodes and arcs. This nesting
feature of nodes communicates the hierarchical structure
of the software (e.g. subsystem or class hierarchies)
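
A small sketch of how such a nested-graph metamodel could look in code. This is my own simplification and not SHriMP's implementation: the class and function names are hypothetical, and an open composite is treated as simply replaced by its children (in SHriMP the container itself stays on screen).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # empty for atomic nodes
    is_open: bool = False  # only meaningful for composite nodes

    def visible(self):
        """Yield the nodes currently shown: an open composite shows its children."""
        if self.children and self.is_open:
            for child in self.children:
                yield from child.visible()
        else:
            yield self

    def leaves(self):
        """Yield all atomic nodes nested inside this node."""
        if not self.children:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()


def composite_arcs(root, arcs):
    """Aggregate leaf-level arcs (pairs of names) into arcs between visible nodes."""
    owner = {}
    for node in root.visible():
        for leaf in node.leaves():
            owner[leaf.name] = node.name
    counts = {}
    for src, dst in arcs:
        a, b = owner[src], owner[dst]
        if a != b:  # arcs hidden inside a closed composite are not drawn
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


# Hypothetical system with two subsystems.
lexer, grammar = Node("lexer"), Node("grammar")
codegen, emitter = Node("codegen"), Node("emitter")
parser = Node("parser", [lexer, grammar])
backend = Node("backend", [codegen, emitter])
system = Node("system", [parser, backend], is_open=True)

arcs = [("lexer", "grammar"), ("grammar", "codegen"), ("lexer", "emitter")]

closed_view = composite_arcs(system, arcs)  # both subsystems closed
parser.is_open = True
opened_view = composite_arcs(system, arcs)  # parser opened, backend still closed
```

With both subsystems closed, the two cross-subsystem dependencies collapse into one composite arc of weight 2. Opening the parser splits that arc and reveals the dependency that was internal to it.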

rigi

rigi's model seems to be pretty cool and general.

The Rigi model is a conceptual model for the representation
and organization of the “bricks” and “mortar” of complex
software systems. It is best characterized as a special purpose
semantic network (graph) data model. The semantics of the
model are a careful definition of the meaning and usage of the
nodes and arcs. The nodes and arcs of the graph represent the
components of a software system and their dependencies. For
example, a node may represent a subsystem, an interface, a variant,
a revision, a specification, a data set, or a picture; an arc may
represent a change, compilation, binding, revision, or aggregation
dependence.

Bauhaus

really impressive piece of engineering.

Information extracted from the source code is represented
in a resource graph (RG) [2], which abstracts from
a particular source language by only representing global
information such as call, type, and use relations.

Examples of relationships range from information
that can be directly extracted from the source code (e.g.,
function calls) to more abstract concepts (e.g., communication
between a client and a server)

The resulting graph can then be visualized and manipulated
in Bauhaus with our extension of the graph editor Rigi

(from the xfig paper)

a micro-survey which may be related: interaction modes. do the composite nodes need to be mapped onto existing programming
language concepts such as packages, modules and directories, or can they be anything?

Sunday, March 16, 2008

...

what is the one thing that in this hour/day/week/month is the most important thing to do to move forward?

programming in the large vs. programming in the small

remer-pitl.

the font of the article is beautiful. the language is beautiful too:

todos

- read the paper of charles simonyi on "meta-programming"

...

the thesis could just as well be called - "Software Evolution in the Large"

wolfmaier's lessons

(wolf-lessons)
finished reading wolfmaier's paper. pretty cool stuff. the guys talk from experience.

- effort of analyzing a system is about 10-12 man-days.
- high level analysis is applied. critical metrics are detected. expected metrics values.
- good source of heuristics that can be mapped on top of the softwarenaut view.
- absolutely necessary to involve the architects
- sotograph - seems to have been useful for them.
- generated code has to be treated differently

19.07 - 19.28

[micro-survey] Clustered Decomposition Evolution

Did anybody look at the evolution of the clustered decompositions of the system?

This would be an interesting study in itself. If hard.

But why? And who cares? And what problems can be solved by doing this?

Saturday, March 15, 2008

...

The difference between Marx and Einstein is that whereas Marx would try to find data to justify a conclusion that he already believed, Einstein went out of his way to find data that might disprove his theory.

harel - on visual formalisms

graphs and hierarchical graphs are useful for representing a relation r over a set. venn diagrams are useful for representing sets and relations between sets. harel puts the two together. leonhard euler (swiss) is behind both: he introduced graphs, and his euler circles later evolved into venn diagrams.

fast-read

fast-read: technique. while reading a paper, read a reference in 5 minutes and write down its summary here on the blog for further reference. if i postpone reading it i will never read it. if i read it in detail i will spend too much time and will never finish the initial task - especially since the paper itself can have other interesting references. only in exceptional cases can i spawn a fast-read from another fast-read.

...

the literature that you can cite is infinite. the skill resides in being able to filter out the irrelevant papers. and to abstract away the details in the relevant ones.

Friday, March 14, 2008

supervising vs. understanding

[micro-survey]

I think that the understanding point of view has actually been taken multiple times already. It would be interesting to check if I could switch focus onto controlling and supervising instead. A nice paper name: "Super-Users Supervising Super-repositories". Hmm, at the same time everything that I published has been published in reverse engineering, so switching domains completely might not be a good idea.

how to gather requirements

There are multiple ways in which one can gather requirements for the task of understanding and supervising the evolution of a system from the 10'000ft perspective, as eric puts it.

distill requirements from informal interviews with individual experts
-- erik donnenburg
-- adi groza
-- reinout heeck
-- contact mihu and see if i can talk with anybody at siemens

find requirements in the literature
- i don't remember seeing requirements for this level...

find requirements on discussions on the net
-- Tips for reading code on the c2 wiki Library calls, Search, ...
-- Discussion on slashdot on how do you reverse engineer half a million lines of c/c++ code.

a formal survey
- survey the researchers separately and ask what they think is useful. then survey the potential users and see what *they* think is helpful.

Thursday, March 13, 2008

Links

Tips for reading code on the c2 wiki

Library calls, Search, ...

Rails is 100% magic and 0% design

Can you automatically detect any of these problems? That would be quality assessment

Survey Bricolage

October 2005

Discussion on slashdot on how do you reverse engineer half a million lines of c/c++ code.

Could you build a questionnaire with questions that you find on the public forums? How do you make sure that the forum is the right one? If you ask on the moose mailing list you will get a different answer than when you ask on yahoo groups.

Tuesday, March 11, 2008

...

Good looking flash-based charts - could be a nice replacement for the current charts in SPO http://www.amcharts.com/

Cool solutions for charts and graphs

Monday, March 10, 2008

Survey on Architectural-Level Evolution

Part of the future platform will be a linked SPO-Snaut view. Using Softwarenaut - one would interactively define a reference point architectural view/perspective p. Then he would be able to take p and compare it with a previous or later version of the system. The results would be visible graphically on the snaut-like view.

Q: Could I also link the authors with the snaut view - show me the changes that Mircea made to the system? The places where Mircea worked on the system? Who cares about that?

Q: Could I automatically detect phases in the evolution of the system? Who cares about this?

Following is a bunch of work which uses visual representation of the difference between architectural views in a system.

When implementing this, I should generate the differences between the architectures as separate data streams. The superposition of the

--

Grouping is part of this.
Filtering is part of this.
A good reference is the survey of Ivica Aracic. Interactive Grouping and Filtering in Graph-Based Softvis

Saturday, March 8, 2008

...

Adobe AIR - runtime that allows applications to cache work locally when the network is down and sync later when the connection is reestablished. This might be one powerful way of developing cross-platform java/flash applications. And yet again, the solution is online... This is why it is good to go with the SPO + Softwarenaut online. Sooner or later, somebody else will do this.

archjava

ArchJava - 2002 - 233 citations - extensions to the java programming language that support the definition of components and ports and allow these components to be interconnected. the idea is nice. however, it seems that the webpage of the tool has not been modified since 2005. what does this mean?

- For softwarenaut this might mean that, once we have an "authoritative" view on the system, it would be cool to discover the common denominator and the exhaustive interfaces of various modules. ?any other ways? remove outliers?

Hierarchical Reflexion Models - Koschke and Simon. Techniques such as reflexion models could easily be super-imposed on top of the e-softwarenaut.

Monday, March 3, 2008

...

everybody makes movies like spielberg. because spielberg took risks and was successful and now everybody copies him.

http://www.youtube.com/watch?v=4SLGT_bXe38&NR=1

be brave in terms of what you expose yourself to
be brave in making yourself foolish
be brave in making mistakes
be brave ...

mining software compilations

Mining large software compilations over time: another perspective of software evolution - analyzing software compilations.

in the same category is also the work of D. Germain presented at WCRE 2007

why not architecture

Browsing through the "Distributed Systems" book of Tanenbaum and Van Steen. In chapter 2, where they talk about software architectures, they talk about things at a much higher level than what reverse engineering calls architecture. They talk about object-based, data-centered, event-based, layered architectures. What people in reverse engineering call architecture, they would probably call components (at most).

From this point of view, the thesis should not be called anything architecture, but rather something like Understanding the macroscopic evolution of software. The title is cool because it easily accommodates the various levels of macro at which one can analyze the software. One macro-level is the project-level. The other level of macro is the super-project/super-repository/project-portfolio level.

In fact, it is in the context of software super-repositories that the architecture makes more sense. If I am a consultant, I don't need to reverse-engineer a system to find out its architecture. Probably people can tell me that it is a layered and an event-based system. On the other hand, if I am looking at Sourceforge, it becomes very interesting to have the possibility of detecting the projects that have an event-based architecture.

In this context, also the cool table makes sense. The granularity levels are the macroscopic levels of software. The time dimension is the evolution. And then the thesis will be about understanding software evolution at all these levels: what are the types of analysis that work at the corresponding levels, who are the stakeholders that are interested at the various levels.

Investigate further:
- the analogy with fashion and culture. Fashion changes very fast - it is not relevant at the macroscopic level. Culture changes slowly, and it is the thing that matters.

Open questions:
Where do the requirements fit into the picture? How do the tests fit?

lens

Listening to danah boyd on the IT conversations podcast. The metaphor of lens. It has already been used in software evolution research in EvoLens

Evolens

There are a few other interesting concepts in the EvoLens paper
- focal points: can exist at any abstraction level. one for every view. reduce the amount of graphical elements that are being visualized.

masters

One cannot serve the two masters of Time and Quality. For either he will respect the one and sacrifice the other, or he will be devoted to one and disregard the other.

Wednesday, February 27, 2008

team development plugins in Eclipse

XCDE (eXtreme Collaborative Development Environment) is a plugin for the Eclipse IDE that adds support for real-time collaborative editing over the Internet, letting developers work on the same files ...

XecliP is a Plug-in for Eclipse that supports distributed pair programming. With this plug-in two developers are able to work together on the same Java project (eg. via internet or within a local netw...

JGear Team Client for Eclipse makes it easier to manage and view your projects across development teams. Using TeamInsight collaboration features, individual developers have a unified real-time view of their project responsibilities for bugs, change requests, code notes, tasks and requirements. The entire team has a shared project web portal with live data and statistics on team vector and velocity. Co

Tuesday, February 19, 2008

organizing the publications page

A nice way of organizing your research papers by one guy working at microsoft: based on the project that you are working on. that's not really new, but for his organization, the papers are secondary to the description of the project.

http://research.microsoft.com/~desney/projects.htm

_

looking at CHI '08. It would be interesting to attend the conference. If nothing else, it would be a good opportunity to visit florence.

What is a good PhD in HCI?
- some people think that showing proof of the capacity of doing interdisciplinary research is important
http://www.dcs.gla.ac.uk/~johnson/papers/phd.html

Monday, January 21, 2008

Understanding dependencies - Related Work

Architecture Analysis Tools to Support Evolution of Large Industrial Systems
by Rotschke and Krikhaar. 2002

The toolset can present details about a dependency between two modules. Details are in html format.

An interesting case of applying architecture evolution analysis tools, including trend metrics, in the reengineering of a large industrial system.

GASE: visualizing software evolution-in-the-large
Holt and Pak. 1996

Holt and Pak in 1996 were among the first to attempt to visualize the evolution of software systems at the architectural level. As a case study, they used a system with a known architecture (subsystems and relations).

GASE is a tool for visualizing structural software evolution. Dependencies between modules don't seem to be aggregated so the result is quite messy. Two versions of a software system are presented on the same diagram and color coding highlights which version the elements belong to. Elements = modules + edges.

Observations
* Most changes occurred between modules, not subsystems
* The software system always grew and rarely shrank
* One developer saw a dependency which was wrong

Characterizing the evolution of class hierarchies
by Girba et al. 2005

Class hierarchy evolution where deleted dependencies are represented and color coded.

Annotated Dependency Graphs for Software Understanding
by Hassan & Holt. 2003

Function level dependency graph. Automatically annotate a dependency graph with information regarding each dependency. Information is retrieved from the versioning system: rationale for change, creator.

The visual representation of the adg also uses colors to represent newly added dependencies and deleted ones.

The inter-function dependency graph can be 'lifted' to directory level.
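
A sketch of what such lifting could look like. This is not taken from the Hassan & Holt paper; the function name, the file layout and the choice to count lifted dependencies are my own assumptions.

```python
import posixpath
from collections import Counter


def lift_to_directories(function_deps, file_of):
    """Lift function-level dependencies to directory-level ones.

    function_deps: iterable of (caller, callee) function names
    file_of:       dict from function name to the path of its defining file
    Returns a Counter: (from_dir, to_dir) -> number of underlying dependencies.
    """
    lifted = Counter()
    for caller, callee in function_deps:
        src_dir = posixpath.dirname(file_of[caller])
        dst_dir = posixpath.dirname(file_of[callee])
        if src_dir != dst_dir:  # drop dependencies internal to one directory
            lifted[(src_dir, dst_dir)] += 1
    return lifted


# Hypothetical functions and files.
file_of = {
    "parse": "src/front/parser.c",
    "lex": "src/front/lexer.c",
    "emit": "src/back/emit.c",
    "alloc": "src/util/mem.c",
}
function_deps = [
    ("parse", "lex"),
    ("parse", "emit"),
    ("emit", "alloc"),
    ("parse", "alloc"),
    ("lex", "alloc"),
]

directory_deps = lift_to_directories(function_deps, file_of)
```

Counting instead of just collecting pairs keeps a weight on each lifted edge, which is handy when the edge is later drawn or expanded back into its details.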

Visualizing the behavior of object-oriented systems
by De Pauw et al. 1993

They instrument oo programs and monitor their execution. They present a rich collection of visualizations of the collected data. Some of the visualizations present inter-class dependencies represented as dependency matrices.

Package Surface Blueprints: Visually Supporting the Understanding of Package Relationships
by Ducasse, Pollet et al. 2007

Their paper focuses on architecting a compact representation of the inheritance and invocation dependencies between packages in an object-oriented system.

The visualisation uses a matrix-based layout which highlights both the way a given package depends on others in the system and the way the classes internal to the package depend on one another.

Exploring Inter-Module Dependencies in Evolving Software Systems
by Lungu and Lanza

The authors analyze dependencies between packages. Their analysis is integrated in the interactive software exploration tool called Softwarenaut. They analyze multiple versions of a software system and create evolution diagrams which present the evolution of the relationship between two packages. For presenting the details of a given dependency in a given version they use a matrix-based visualization.

They also extract various types of dependency patterns.

Wednesday, January 16, 2008

Schwanke's intelligent tool

----------------

Brainstorming with Romain about a tool which would automatically suggest moving entities from one module to another based on the coupling between these entities and the other modules.

In the simplest case you have two modules, and you move an entity if the number of dependencies between them decreases. Generalizing, you can suggest a move operation if the total number of inter-module dependencies decreases. This would be pretty straightforward to implement, and I wonder if there is something like this in Eclipse already.
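
A brute-force sketch of the generalized version, just to check that the idea holds together. All names and the example data are hypothetical, and it only tries single-entity moves; it is not Schwanke's algorithm and not an existing Eclipse feature.

```python
def cross_module_count(deps, module_of):
    """Number of dependencies whose two ends live in different modules."""
    return sum(1 for a, b in deps if module_of[a] != module_of[b])


def suggest_moves(deps, module_of):
    """Return (entity, from_module, to_module, gain) for every single move
    that lowers the total number of inter-module dependencies."""
    modules = set(module_of.values())
    baseline = cross_module_count(deps, module_of)
    suggestions = []
    for entity, home in module_of.items():
        for target in modules - {home}:
            trial = dict(module_of)
            trial[entity] = target  # pretend the entity was moved
            gain = baseline - cross_module_count(deps, trial)
            if gain > 0:
                suggestions.append((entity, home, target, gain))
    # Most beneficial moves first.
    return sorted(suggestions, key=lambda s: -s[3])


# Hypothetical system: B sits in m1 but mostly talks to entities in m2.
module_of = {"A": "m1", "B": "m1", "C": "m2", "D": "m2", "E": "m2"}
deps = [("A", "B"), ("B", "C"), ("B", "D"), ("B", "E"), ("C", "D")]

suggestions = suggest_moves(deps, module_of)
```

Recounting all dependencies for every trial move is quadratic-ish and only fine for small systems; an incremental count per entity would be the obvious next step.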

One paper in this domain is Robert Schwanke: "An Intelligent Tool for Re-engineering Software Modularity".
- he measures 'design coupling'
- every method has a set of 'features' which are defined as the set of methods which are not local.
- based on the features the methods can be clustered hierarchically - the clustering can be automated or interactive. the clustered decompositions can be compared to the actual decomposition.
- detecting mavericks - misplaced procedures - detected when a procedure has too many bad neighbours

Tuesday, January 15, 2008