The Year of the Thesis: March 2008

Sunday, March 23, 2008

Top-Down Software Exploration

Another top-down approach is the Software Reflexion Model by Murphy, Notkin,
and Sullivan (1995). The goal of the Software Reflexion Model is to capture and exploit the
differences that exist between the source code organization and the designer’s
mental model of the high-level system organization. An engineer defines a high-
level model of the structure of the system and specifies how the model maps to
the source. A tool then computes a software reflexion model that shows where the
engineer’s high-level model agrees with and where it differs from a model of the
source. The primary purpose of this technique is to streamline the amount of time
it takes for someone unfamiliar with the system to understand its source code
structure. (from koshcke-thesis)
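
A minimal sketch of the computation described above, to make the "agrees with / differs from" part concrete. The terms convergence, divergence, and absence come from the reflexion model paper; the function name, the data layout, and the example file names are my own assumptions, not the API of any existing tool.

```python
def reflexion_model(hl_edges, source_edges, mapping):
    """Compare a high-level model against dependencies found in the source.

    hl_edges:     (from, to) pairs between high-level entities (the engineer's model)
    source_edges: (from, to) pairs between source entities (extracted dependencies)
    mapping:      source entity -> high-level entity (the engineer's map)
    """
    # Lift every source dependency to the high-level entities it maps to.
    lifted = set()
    for src, dst in source_edges:
        if src in mapping and dst in mapping:
            a, b = mapping[src], mapping[dst]
            if a != b:  # ignore dependencies internal to one entity
                lifted.add((a, b))

    expected = set(hl_edges)
    return {
        "convergence": expected & lifted,  # expected and present in the source
        "divergence": lifted - expected,   # present in the source, not expected
        "absence": expected - lifted,      # expected, not found in the source
    }


# Hypothetical example: three high-level entities, three source files.
hl_edges = [("ui", "core"), ("core", "storage")]
mapping = {"Window.java": "ui", "Engine.java": "core", "Db.java": "storage"}
source_edges = [("Window.java", "Engine.java"), ("Window.java", "Db.java")]

result = reflexion_model(hl_edges, source_edges, mapping)
```

Source entities that the map does not cover are simply skipped here; a real tool would probably report them.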

__

One reason for top-down exploration is supporting supervision. The user needs to
see "his" system evolve.

Friday, March 21, 2008

Related Work on Module Dependencies

http://www.eclipsezone.com/articles/lattix-dsm/ - matrix representation of dependencies. Implemented and integrated in Eclipse. By Lattix inc.
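
To remember what a matrix representation of dependencies looks like, here is a small sketch. It is not how Lattix builds its matrix; the function names and the convention "row depends on column" are assumptions I picked for the illustration.

```python
def dependency_matrix(modules, dependencies):
    """Build a square matrix where cell [i][j] counts dependencies from
    modules[i] to modules[j] (row depends on column, by assumption)."""
    index = {m: i for i, m in enumerate(modules)}
    matrix = [[0] * len(modules) for _ in modules]
    for src, dst in dependencies:
        matrix[index[src]][index[dst]] += 1
    return matrix


def format_matrix(modules, matrix):
    """Render the matrix as plain text with module names as row and column labels."""
    width = max(len(m) for m in modules)
    header = " " * width + " " + " ".join(m.rjust(width) for m in modules)
    rows = [
        m.ljust(width) + " " + " ".join(str(c).rjust(width) for c in row)
        for m, row in zip(modules, matrix)
    ]
    return "\n".join([header] + rows)


# Hypothetical modules and dependencies.
modules = ["ui", "core", "storage"]
dependencies = [("ui", "core"), ("ui", "core"), ("core", "storage")]
matrix = dependency_matrix(modules, dependencies)
text = format_matrix(modules, matrix)
```

With the modules ordered from top layer to bottom layer, any non-zero cell below the diagonal would be a dependency that goes against the layering.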

Tuesday, March 18, 2008

inherit and reuse...

From OAuth documentation: "While we wanted the best protocol we could design, we also wanted one that people would use and that would be compatible with existing authentication methods, inherit from existing RFCs and reuse web standards wherever possible. In this way, we ended up leaving a lot out of the spec that seemed interesting but ultimately didn’t belong."

Monday, March 17, 2008

[micro-survey] metamodels of hierarchical exploration tools

what are the metamodels that people use when doing hierarchical graph analysis of software systems?

shrimp - the metamodel is a generalized nested graph. in "manipulating and documenting software structures" (stor-manip) storey discusses the introduction of the nested graphs formalism by d. harel.
(harel-visform)

Nested graphs, in addition to nodes and
arcs, contain composite nodes which are used for denoting
set inclusion. The containment or nesting feature
of composite nodes implicitly communicates the
parent-child relationships in a hierarchy. In SHriMP a
non-leaf node is open when its children are visible and
closed when its children are hidden from view.
[...]
Composite nodes correspond to subsystems in the software.
Composite arcs represent a collection of dependencies.
Composite nodes may contain other composite nodes
and arcs as well as atomic nodes and arcs. This nesting
feature of nodes communicates the hierarchical structure
of the software (e.g. subsystem or class hierarchies)
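
A rough sketch of the nested graph idea from the quote: composite nodes that can be open or closed, and composite arcs that collect the dependencies hidden inside closed nodes. This is only my reading of the description, not SHriMP's actual data model; the class, the function names, and the example subsystems are all made up.

```python
from collections import Counter


class Node:
    """A node in a nested graph; a node with children is a composite node."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.children = []
        self.is_open = False  # open: children visible; closed: children hidden
        for child in children:
            self.add(child)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def visible_representative(node):
    """Return the outermost closed ancestor of the node, or the node itself
    when every ancestor on the path from the root is open."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    path.reverse()  # root first
    for candidate in path:
        if not candidate.is_open:
            return candidate
    return path[-1]


def composite_arcs(arcs):
    """Lift arcs between atomic nodes to the nodes currently visible,
    counting how many dependencies each composite arc stands for."""
    lifted = Counter()
    for src, dst in arcs:
        a, b = visible_representative(src), visible_representative(dst)
        if a is not b:  # arcs hidden inside one closed node are not drawn
            lifted[(a.name, b.name)] += 1
    return lifted


# Hypothetical system: "parser" stays closed, "backend" is opened.
lexer, grammar, codegen = Node("lexer"), Node("grammar"), Node("codegen")
parser = Node("parser", [lexer, grammar])
backend = Node("backend", [codegen])
system = Node("system", [parser, backend])
system.is_open = True
backend.is_open = True

arcs = [(lexer, codegen), (grammar, codegen), (lexer, grammar)]
visible = composite_arcs(arcs)
```

Opening "parser" afterwards would split the single composite arc into the two arcs from "lexer" and "grammar".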

rigi

rigi's model seems to be pretty cool and general.

The Rigi model is a conceptual model for the representation
and organization of the “bricks” and “mortar” of complex
software systems. It is best characterized as a special purpose
semantic network (graph) data model. The semantics of the
model are a careful definition of the meaning and usage of the
nodes and arcs. The nodes and arcs of the graph represent the
components of a software system and their dependencies. For
example, a node may represent a subsystem, an interface, a variant,
a revision, a specification, a data set, or a picture; an arc may
represent a change, compilation, binding, revision, or aggregation
dependence.

Bauhaus

really impressive piece of engineering.

Information extracted from the source code is represented
in a resource graph (RG) [2], which abstracts from
a particular source language by only representing global
information such as call, type, and use relations.

Examples of relationships range from information
that can be directly extracted from the source code (e.g.,
function calls) to more abstract concepts (e.g., communication
between a client and a server)

The resulting graph can then be visualized and manipulated
in Bauhaus with our extension of the graph editor Rigi

(from the xfig paper)

a micro-survey which may be related: interaction modes. do the composite nodes need to be mapped onto existing programming
language concepts such as packages, modules, and directories, or can they be anything?

Sunday, March 16, 2008

...

what is the one thing that in this hour/day/week/month is the most important thing to do to move forward?

programming in the large vs. programming in the small

remer-pitl.

the font of the article is beautiful. the language is beautiful too:

todos

- read the paper of charles simonyi on "meta-programming"

...

the thesis could just as well be called - "Software Evolution in the Large"

wolfmaier's lessons

(wolf-lessons)
finished reading wolfmaier's paper. pretty cool stuff. the guys talk from experience.

- effort of analyzing a system is about 10-12 man days.
- high level analysis is applied. critical metrics are detected. expected metrics values.
- good source of heuristics that can be mapped on top of the softwarenaut view.
- absolutely necessary to involve the architects
- sotograph - seems to have been useful for them.
- generated code has to be treated differently

19.07 - 19.28

[micro-survey] Clustered Decomposition Evolution

Did anybody look at the evolution of the clustered decompositions of the system?

This would be an interesting study in itself. If hard.

But why? And who cares? And what problems can be solved by doing this?

Saturday, March 15, 2008

...

The difference between Marx and Einstein is that whereas Marx would try to find data to justify a conclusion that he already believed, Einstein went out of his way to find data that might disprove his theory.

harel - on visual formalisms

graphs and hierarchical graphs are useful for representing a relation r over a set. venn diagrams are useful for representing sets and relations between sets. harel puts the two together in higraphs. leonhard euler (swiss) introduced both graphs and euler circles, the precursor of venn diagrams.

fast-read

fast-read: technique. while reading a paper, read a reference in 5 minutes and write down its summary here on the blog for further reference. if i postpone its reading for another time i will never read it. if i read it in detail i will spend too much time and will never finish the initial task - especially since the paper itself can have other interesting references. only in exceptional cases can i spawn a fast-read from another fast-read.

...

the literature that you can cite is infinite. the skill resides in being able to filter out the irrelevant papers. and to abstract away the details in the relevant ones.

Friday, March 14, 2008

supervising vs. understanding

[micro-survey]

I think that actually the understanding point of view has been taken multiple times already. It would be interesting to check if I could actually switch focus onto the controlling and supervising instead. A nice paper name: "Super-Users Supervising Super-repositories". Hmm, at the same time everything that I published has been published in reverse engineering, so switching domains completely might not be a good idea.

how to gather requirements

There are multiple ways in which one can gather requirements for the task of understanding and supervising the evolution of a system from the 10'000ft perspective, as eric says it.

distill requirements from informal interviews with individual experts
-- erik donnenburg
-- adi groza
-- reinout heeck
-- contact mihu and see if i can talk with anybody at siemens

find requirements in the literature
- i don't remember seeing requirements for this level...

find requirements on discussions on the net
-- Tips for reading code on the c2 wiki Library calls, Search, ...
-- Discussion on slashdot on how do you reverse engineer half a million lines of c/c++ code.

a formal survey
- survey the researchers separately and ask what they think is useful. then survey the potential users and see what *they* think is helpful.

Thursday, March 13, 2008

Links

Tips for reading code on the c2 wiki

Library calls, Search, ...

Rails is 100% magic and 0% design

Can you automatically detect any of these problems? That would be quality assessment.

Survey Bricolage

October 2005

Discussion on slashdot on how do you reverse engineer half a million lines of c/c++ code.

Could you build a questionnaire with questions that you find on the public forums? How do you make sure that the forum is the right one? If you ask on the moose mailing list you will get a different answer than when you ask on yahoo groups.

Tuesday, March 11, 2008

...

Good looking flash-based charts - could be a nice replacement for the current charts in SPO http://www.amcharts.com/

Cool solutions for charts and graphs

Monday, March 10, 2008

Survey on Architectural-Level Evolution

Part of the future platform will be a linked SPO-Snaut view. Using Softwarenaut - one would interactively define a reference point architectural view/perspective p. Then he would be able to take p and compare it with a previous or later version of the system. The results would be visible graphically on the snaut-like view.
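
A first sketch of what comparing the view p across two versions could mean in code. Nothing here is the Softwarenaut or SPO API; I assume a view is just a map from source entities to modules, and a version is a list of entity-level dependencies. All names are hypothetical.

```python
def lift(dependencies, view):
    """Lift entity-level dependencies to the modules defined by the view.
    Entities that the view does not cover are skipped."""
    lifted = set()
    for src, dst in dependencies:
        if src in view and dst in view:
            a, b = view[src], view[dst]
            if a != b:
                lifted.add((a, b))
    return lifted


def diff_views(old_dependencies, new_dependencies, view):
    """Compare the same architectural view on two versions of the system."""
    old = lift(old_dependencies, view)
    new = lift(new_dependencies, view)
    return {
        "added": new - old,      # module dependencies that appeared
        "removed": old - new,    # module dependencies that disappeared
        "unchanged": old & new,  # module dependencies present in both versions
    }


# Hypothetical view p and two versions of the system.
p = {
    "Parser.java": "frontend",
    "Lexer.java": "frontend",
    "Emitter.java": "backend",
    "Log.java": "util",
}
version_1 = [
    ("Parser.java", "Lexer.java"),
    ("Parser.java", "Emitter.java"),
    ("Emitter.java", "Parser.java"),
]
version_2 = [
    ("Parser.java", "Emitter.java"),
    ("Emitter.java", "Log.java"),
    ("Lexer.java", "Log.java"),
]

changes = diff_views(version_1, version_2, p)
```

Keeping added, removed, and unchanged as separate sets means each one can be drawn as its own layer on the snaut-like view.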

Q: Could I also link the authors with the snaut view - show me the changes that Mircea made to the system? The places where Mircea worked on the system? Who cares about that?

Q: Could I automatically detect phases in the evolution of the system? Who cares about this?

Following is a bunch of work which uses visual representation of the difference between architectural views in a system.

When implementing this, I should generate the differences between the architectures as separate data streams. The superposition of the

--

Grouping is part of this.
Filtering is part of this.
A good reference is the survey of Ivica Aracic: Interactive Grouping and Filtering in Graph-Based Softvis.

Saturday, March 8, 2008

...

Adobe AIR - runtime that allows applications to cache work locally when the network is down and sync later when the connection is reestablished. This might be one powerful way of developing cross-platform java/flash applications. And yet again, the solution is online... This is why it is good to go with the SPO + Softwarenaut online. Sooner or later, somebody else will do this.

archjava

ArchJava - 2002 - 233 citations - extensions to the java programming language that support the definition of components and ports and allow these components to be interconnected. the idea is nice. however, it seems that the webpage of the tool has not been modified since 2005. what does this mean?

- For softwarenaut this might mean that, once we have an "authoritative" view on the system, it would be cool to discover the common denominator and the exhaustive interfaces of various modules. ?any other ways? remove outliers?

Hierarchical Reflexion Models - Koschke and Simon. Techniques such as reflexion models could easily be super-imposed on top of the e-softwarenaut.

Monday, March 3, 2008

...

everybody makes movies like spielberg. because spielberg took risks and was successful, and now everybody copies him.

http://www.youtube.com/watch?v=4SLGT_bXe38&NR=1

be brave in terms of what you expose yourself to
be brave in making yourself foolish
be brave in making mistakes
be brave ...

mining software compilations

Mining large software compilations over time: another perspective of software evolution - analyzing software compilations.

in the same category is also the work of D. Germain presented at WCRE 2007

why not architecture

Browsing through the "Distributed Systems" book of Tanenbaum and Van Steen. In chapter 2, where they talk about software architectures, they talk about things at a much higher level than what reverse engineering calls architecture. They talk about object-based, data-centered, event-based, layered architectures. What in reverse engineering people call architecture, they would probably call components (at most).

From this point of view, the thesis should not be called anything architecture, but rather something like Understanding the macroscopic evolution of software. The title is cool because it easily accommodates the various levels of macro at which one can analyze the software. One macro-level is the project-level. The other level of macro is the super-project/super-repository/project-portfolio level.

In fact, it is in the context of software super-repositories that the architecture makes more sense. If I am a consultant, I don't need to reverse-engineer a system to find out its architecture. Probably people can tell me that it is a layered and an event-based system. On the other hand, if I am looking at Sourceforge, it becomes very interesting to have the possibility of detecting the projects that have an event-based architecture.

In this context, also the cool table makes sense. The granularity levels are the macroscopic levels of software. The time dimension is the evolution. And then the thesis will be about understanding software evolution at all these levels: what are the types of analysis that work at the corresponding levels, who are the stakeholders that are interested at the various levels.

Investigate further:
- the analogy with fashion and culture. Fashion changes very fast - it is not relevant at the macroscopic level. Culture changes slowly, and it is the thing that matters.

Open questions:
Where do the requirements fit into the picture? How do the tests fit?

lens

Listening to danah boyd on the IT conversations podcast. The metaphor of lens. It has already been used in software evolution research in EvoLens

EvoLens

There are a few other interesting concepts in the EvoLens paper
- focal points: can exist at any abstraction level. one for every view. reduce the amount of graphical elements that are being visualized.

masters

One cannot serve the two masters of Time and Quality. For either he will respect the one and sacrifice the other, or he will be devoted to one and disregard the other.