Tuesday, March 15, 2011

A call for identifying your custom changes

Reviewing code changes is difficult at best, but when reviewing a "basic_change" in Magik, it would be very helpful if the changes were very clearly bracketed with private comments.

For example, you might have an #EMQ like this...

_method some_class.some_method()
## some_method() : some_return_value
##
## does something crazy

_global a

(a.b[c], a.b[d]) << (a.b[d], a.b[c])
_endmethod
$


... which you might want to refactor to be more robust. It would be really, really helpful to future code reviewers if you very explicitly indicated which lines of code you changed. Like this...

_method some_class.some_method()
## some_method() : some_return_value
##
## does something crazy

_global a

# BEGIN CHANGES
#(a.b[c], a.b[d]) << (a.b[d], a.b[c])
_try _with cond
(a.b[c], a.b[d]) << (a.b[d], a.b[c])
_when error
write("!!! EMQ !!!",cond.report_string)
_endtry
# END CHANGES
_endmethod
$


You can see that I actually commented out the original line and then put a copy of the line back in the _try block. This makes it very easy for a reviewer (and text diff tools) to see what has changed. I would not recommend using this approach for your own product code. Source Code control systems can identify these changes for you. What I am more concerned about is cases where you have redefined a core Smallworld method to address some business requirement. Once compiled into the image, it is very difficult to find the original version of the method, so the more that you can document as a customizer, the better it will be for future code readers.

Igor over at MagikEmacs pointed out to me that Alt-c in his extended version of Emacs does exactly this "changed code block" highlighting for you. You can see an example in his Screen-cast #3 at 3:50 where he uses exactly this technique.

5 comments:

Alfred Sawatzky said...

Igor over at MagikEmacs pointed out to me that Alt-c in his extended version of Emacs does exactly this "changed code block" highlighting for you. You can see an example in his Screen-cast #3 at 3:50 where he uses exactly this technique of

Anonymous said...

I concur with Alfred on delimiting the changes. I also like to put a date on the change so someone can say, "two months ago it worked, but not now".

To be a bit more pedantic, I like to duplicate the original code, then comment out (F2#) and add a blank line or two so that if the core code had something commented out, it would be double commented out, removing any ambiguity of whether or not it was executed before I modified it.
# # (a,b) << (b,a)

Bruce

Duncan MacGregor said...

Out of curiosity, how much customisation do you guys do in place, and how much do you do by taking a copy of a method and putting it in another file? I ask because I think you might be able to simplify the process of customisation by placing the original product source under version control, modifying in place, and using the tools provided by that version control system to track the times and reasons for changes.

Recent versions of emacs have good support for git and mercurial and will let you view an annotated version of a file (C-x v g) and then jump to the log entry for a line (l), and graphical tools like tortoisehg have similar functionality.

Of course patches will make your life harder, using F4 e to compare a method with the version loaded in the image is extremely useful.

Alfred Sawatzky said...

Hi Duncan,

I don't know how to provide a quantitative answer for how much customisation is done in place. I have asked around and a common thought is that core product directories should not be touched... even if using a source control system.

I have seen approaches that involve putting a "basic_changes" or "basic_additions" directory in each module. I have seen other approaches that involve "custom" modules that require the core modules and these custom modules have overwritten methods, etc.

My personal favorite is to define every customisation as essentially a defect in the core product. And if it is a defect, then it should be fixed using the patch mechanism. You can read my rant here.

But your comment inspired another thought. Do you think GE would open up the public methods/classes in a GIT repository? That way each customer would have their own clone of the entire repository. They already have their own clone now... except it was delivered via CD/DVD instead of over the internet. And because each customer has their own clone they could make their own changes directly in their own repository. If they had changes that they thought should be sent back to GE, they could use the git "push" back to a GE integration manager who could then decide whether to include the code in the core product. When it comes to upgrade time, each Smallworld customer could do a "pull" and the git framework would handle a lot of the "what changed" questions. Maybe you could even eliminate TSB deliveries because all changes would be made in the master repository and each Smallworld customer would simply "pull" when there is new code to be had.

I can see alot of advantages to setting up a distributed code management system linking GE with its customers. It would simplify upgrades (I think), simply TSB releases, encourage contributions back to GE.

Anyone interested in continuing this discussion?

Alfred

Alfred Sawatzky said...

From Reinhard....


I do the commenting even more pedantic than Bruce.

I’ve some EMACS hot keys to generate

# NEW

# END

# OLD
# …
# END

Or both combined.

I put these comments in the first column to be very obvious.
And I make the comments in a way that everything that is *not* between ‘#NEW ... #END’ (including the parts between ‘#OLD .. #END’) is exactly the original code.

In many cases I was very lucky to have these comments. There have been cases when changes haven’t been ported correctly to new base versions and that comments help to identify such situations, and – in lucky cases – which has been the Original intent of the change.

Just an example:


a_top_level_geom.special?
_then
# OLD: (downgrade von P09915801_1.magik)
# component_1 << a_top_level_geom.user_component(1)
# END
# NEW:
_for each_component _over a_top_level_geom.user_components()
_loop
# get any existing dimension component and leave
component_1 << each_component
_leave
_endloop
# END

a_status << component_1.status
...