Delegates as an alternative to single-method interfaces

I noted some time ago that one can use delegates instead of interfaces that contain a single method. In fact, I think delegates are more flexible than single-method interfaces, and I'll try to explain why.

The scenario

As a reminder, the discussion started with this post: Making ICollection inherit from more fine-granular interfaces. What was the problem? I wanted to design a method that adds a large amount of data to some collection. The method would potentially serve all sorts of different clients (callers). A very reusable solution would be to just return the items so that the client can add them on the caller's side, but that would mean:

  • Clients have to do the additional work of adding the items to their collections (duplicating code with the same intent)
  • It also at least doubles the cost: each element is first put into temporary storage and then into the target storage, which gives us 2n operations instead of n
  • The method has to accumulate the items before returning them (O(n) memory consumption for the returned values: a List<T>, ReadOnlyCollection<T> or whatever)
  • Even with yield return, where only O(1) temporary storage for return values is required (values are returned lazily, one by one, as they are needed), you'd still have to add those values to the target collection sooner or later

That's why I decided to go for a void method that accepts the collection to add the items to (avoiding the performance penalties). My first implementation accepted an ICollection<T>, the .NET Framework's most probable candidate for a thing that has an 'Add' on it. However, ICollection<T> is not good enough, because not all types that have an 'Add' method implement ICollection<T>. There was some buzz in the blogosphere about the duck-typed implementation of collection initializers in C# 3.0 - a good example of a place where we definitely lack an ISupportsAdd<T> interface or something like it.
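To see that duck typing in action, here's a minimal sketch (the Bag class is made up): a collection initializer compiles against any type that implements IEnumerable and has some method named Add - no ICollection<T> involved.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// The compiler only requires IEnumerable plus a method named "Add" -
// the Add is bound by name (duck typing), not through any interface.
class Bag : IEnumerable
{
    private readonly List<int> items = new List<int>();

    public void Add(int item)
    {
        items.Add(item);
    }

    public IEnumerator GetEnumerator()
    {
        return items.GetEnumerator();
    }
}

class Program
{
    static void Main()
    {
        // Compiles into three Bag.Add calls, even though Bag
        // implements no ICollection<T>:
        var bag = new Bag { 1, 2, 3 };

        foreach (int item in bag)
        {
            Console.Write(item); // prints "123"
        }
    }
}
```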

The Solution with Delegates

To make a long story short, here's what I think is a good solution to the problem. Instead of having the interface:

    interface ISupportsAdd<T>
    {
        void Add(T item);
    }

it is enough to use the System.Action<T> delegate. Compare the old way and the new way:

    void OldAdd<T>(ISupportsAdd<T> collectionToFill)
    {
        // ...
    }

    void NewAdd<T>(System.Action<T> addMethod)
    {
        // ...
    }
And on the caller's side:

    // The target collection can be anything
    // that has an "Add" method
    // (which can even be named differently)
    List<int> resultsToFill = new List<int>();

    // Don't pass the entire collection
    // when you are only going to use
    // the Add method anyway!
    // Only pass the Add method itself:
    NewAdd<int>(resultsToFill.Add);
See, it's that easy. Just pass a "pointer" to the Add method of your choice, and the NewAdd method will call it as many times as there are items!

The important lesson I've learned from this: to make a piece of code reusable (in this case, a method), one should provide it with only as much information as is necessary.

Earlier I tried to do that subconsciously by demanding more "fine-granular" interfaces - providing only as much information about the type as necessary (we were not going to use Remove() anyway, so why require it?).

Delegates - "interfaces on a member level"

Now I've realized that in cases where you only need one method, you can go even more granular - there's no need to pass a type, just pass a member (members are more fine-granular than types). See how it already starts to smell a little like duck typing? You don't have to declare a special interface; your existing type's signature is just fine, as long as it provides the necessary member. This is an awesome bonus feature of C#. Unfortunately, it is limited to one member, and what's more, that member has to be a method. (No delegate properties - a pity, isn't it?)
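As a partial workaround for the missing "delegate properties", you can wrap the property access in a lambda, which gives you a delegate to the getter or setter. A minimal sketch (the Person class is made up):

```csharp
using System;

class Person
{
    public string Name { get; set; }
}

class Program
{
    static void Main()
    {
        var person = new Person { Name = "Kirill" };

        // There is no delegate *to* a property, but a lambda that
        // reads or writes the property is usually close enough:
        Func<string> getName = () => person.Name;
        Action<string> setName = value => person.Name = value;

        setName("Anders");
        Console.WriteLine(getName()); // prints "Anders"
    }
}
```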

Delegates vs. single-method interfaces

To summarize: there has been ongoing talk in the blogosphere about adding an implementation of an interface to an existing type without changing the type itself. There is a desire to automatically or manually make a type implement some interface as soon as it has all the necessary members with the right signatures. C# currently offers a solution, but only if your "interface" consists of exactly one method - you can then use a delegate to that method and pass it around. Delegates in this case are semantically equivalent to single-method interfaces: you can pass both around, and you can call both. But delegates are more flexible, because you can take a delegate (a "pointer") to a method of an existing type without changing that type, whereas you can't make an existing type implement your interface without changing it.


DG 1.0 - free Dynamic Geometry Software for kids

Dynamic geometry software lets you explore interactive math constructions directly on your screen. DG lets you create arbitrary ruler-and-compass constructions using your mouse or stylus. It also allows you to play and experiment with the drawing using drag-and-drop (something that isn't possible on paper).



The points in the drawing are actually draggable - as you drag a point, the triangle changes along with the rest of the construction. You can create hyperlinked and commented interactive documents:

and live measurements:



If you're interested (or you have kids who currently study geometry), you can download DG 1.0 from my website: dg.osenkov.com. The software is free and no setup is necessary, just unzip the archive to any folder and run Geometry.exe. Please let me know if you have any questions or feedback, I'd be happy to help.


A bit of history

For the curious, DG is a piece of software I wrote in 1999-2002 while working for the Kharkiv Pedagogical University in Ukraine. A great team of teachers, methodologists and mathematicians worked on this software package, which includes more than 250 interactive drawings to support the school geometry course. I was the only programmer on the project, with about 5 PMs and 5 testers :) It is great to know that DG is still being used in many Ukrainian schools in geometry lessons, after we shipped the software in the summer of 2002. The idea of dynamic geometry dates back to the 1980s, and there are many dynamic geometry programs, so I'm not saying we invented anything new. But we implemented DG specifically to suit the needs of Ukrainian teachers and students, and it also turned out to be a pretty good piece of software for everyone else to use. So I decided I'd blog about it - maybe it can be of some use to someone.


Although it has been more than 5 years since we shipped DG, I still sometimes receive feedback from teachers. I'm always happy to see anyone using DG, because I put a lot of soul and effort into this project at the time. Thanks to this project, I grew from the geeky teenager I was in 1999 into an experienced developer who shipped and deployed working, tested software in 2002 that still works in 2007 :) I'm very thankful to our project leader, Sergey Rakov, whom I also respectfully consider my personal Teacher.


Please use File.ReadAllText and the like

Starting with .NET 2.0, the basic file read/write API is finally implemented the way it should be. I've noticed that some people are still using the old way (StreamReader & Co.), so I highly recommend the static methods

  • ReadAllText
  • ReadAllLines
  • ReadAllBytes


  • WriteAllText
  • WriteAllLines
  • WriteAllBytes

on the System.IO.File class.
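For a small file, the difference in code size is noticeable. A quick sketch (the file name is made up):

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "demo.txt");

        // One line each - no stream management, no using blocks:
        File.WriteAllText(path, "Hello, world!");
        string text = File.ReadAllText(path);

        Console.WriteLine(text); // prints "Hello, world!"
        File.Delete(path);
    }
}
```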

Thanks :)

Update: I'd like to correct myself, thanks to kerneltrap (see comments below). I'm not suggesting people should always use File.ReadAll* as opposed to the stream-based approach, but it makes a lot of sense for simple, small files that you'd normally read in one step - for code readability reasons.

Of course, if you are doing non-trivial file operations, a using block with StreamReader/StreamWriter is better for performance reasons, and you don't have to load the entire file into memory this way. I was just seeing cases where a small (10 KB) file needed to be read into a string, and StreamReader.ReadToEnd was used for that purpose. So I'm sorry for not clearly expressing what I actually meant :)



Extension methods: one of my favorite C# 3.0 features

With C# 2.0 I sometimes had a feeling that something bigger was coming. As it turns out, it was C# 3.0 (surprise!). Now the whole picture makes perfect sense, all the features start playing together seamlessly, and I have to say it is an awesome release. LINQ is a fundamental breakthrough, and all the machinery required to implement it is great as well. I especially like extension methods, because they allow me to write more concise and expressive code than before.

Enhancing existing APIs

Before C# 3.0, the creator of an API or framework was solely responsible for its readability and usability. Users of the API had to consume the available methods as they were, unless they were ready to build their own facades, adapters or decorators. Now the clients of an existing API can significantly enhance its readability, expressiveness and usability without changing a single bit of it.

Adding default implementations to interfaces

Another advantage is similar to type classes in Haskell: the ability to add a default implementation to existing interfaces. The best example of this is IEnumerable<T>, which now boasts so much more functionality than before! And note: we haven't changed a thing about IEnumerable<T>; we just added some stuff somewhere else, and all IEnumerable<T> implementations suddenly became so much more powerful. As we know from my previous posts, this is like adding 'mixins' to the language - just implement the interface, and you get the default functionality for free.
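Here's a minimal sketch of this "mixin" effect with a made-up interface: the extension method is written once against IShape, and every implementation picks it up for free.

```csharp
using System;

// A hypothetical interface with a single required member...
interface IShape
{
    double Area { get; }
}

// ...and a "default implementation" added from the outside
// as an extension method:
static class ShapeExtensions
{
    public static bool IsLargerThan(this IShape shape, IShape other)
    {
        return shape.Area > other.Area;
    }
}

class Square : IShape
{
    public double Side { get; set; }
    public double Area { get { return Side * Side; } }
}

class Program
{
    static void Main()
    {
        var a = new Square { Side = 3 };
        var b = new Square { Side = 2 };

        // Square never declared IsLargerThan - it gets it "for free"
        // by merely implementing IShape:
        Console.WriteLine(a.IsLargerThan(b)); // prints "True"
    }
}
```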

Composability and fluent interface

Also, an important advantage of extension methods is that they can be called on null references without any NullReferenceException. The best example of this is of course the string.IsEmpty() extension method:

        public static bool IsEmpty(this string possiblyNullString)
        {
            return string.IsNullOrEmpty(possiblyNullString);
        }

This simply boosts composability, because we don't have to check for null before calling a method on an object - we can do it inside the extension method. Earlier, callers were responsible for checking for a null object every darn time they wanted to call an instance method on it. Now, if we can gracefully handle a null object without throwing (e.g. just return null), it saves a lot of effort on the caller's side. As I said, our calls become truly composable: for example, I can chain calls like Object1.Method2().GetObject3().TryAndGetObject4().OhAndPerhapsTryGettingObject5AsWell() without worrying that one of the methods might return null. If one does return null, the whole chain won't throw and will just return null. Of course, debugging this can be a lot of pain, because we don't save intermediate results, but it is still a huge enabler for People Who Know What They Are Doing. This composability is also a nice, helpful feature for designing a fluent API (and speaking of fluent APIs... what about extension properties?)
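Here's a minimal sketch of such a null-tolerant chain (FirstWord is a made-up helper):

```csharp
using System;

static class StringExtensions
{
    // An extension method is just a static call, so "str" may be
    // null here without a NullReferenceException being thrown.
    public static string FirstWord(this string str)
    {
        if (string.IsNullOrEmpty(str))
        {
            return null;
        }
        return str.Split(' ')[0];
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine("hello brave new world".FirstWord()); // prints "hello"

        // Every step of the chain tolerates null:
        string nothing = null;
        Console.WriteLine(nothing.FirstWord().FirstWord() == null); // prints "True"
    }
}
```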

Extension methods in action: populating and grouping TreeView items

Now let me show a couple of delicious examples. I had an interesting problem recently. There was a list of objects (say, Cars), each of which had four properties. Let's say like this:

    class Car
    {
        public string Make { get; set; }
        public string Model { get; set; }
        public int Year { get; set; }
        public Color Color { get; set; }
    }
Now, I had to add all the Cars from the list to a WPF TreeView control and hierarchically group the cars by Make (similar makes go under a single node), then by Model, then by Year, and finally by Color. This is actually a pretty good interview question, and as I usually totally suck at interview questions, I started building an overly complicated strongly typed object model to represent a MakeGroup (based on Dictionary), which contains a ModelGroup, which contains a YearGroup, etc. I would then use LINQ to query the master car list, group by make, create a MakeGroup object for each group, and so on. After the object model was done, I would walk it with a special TreeViewItemBuilder that would build the TreeView items and nest them correspondingly. After about 1.5 hours of this mess (unfortunately, interviews usually last less than 1.5 hours) I was struck by what I think is an excellent idea. I wrote this extension method:

    public static class WPFExtensions
    {
        public static TreeViewItem FindOrAdd(
            this ItemsControl parent,
            string header)
        {
            // try to find an existing item with this name
            TreeViewItem result = parent.Items
                .OfType<TreeViewItem>()
                .FirstOrDefault(x => x.Header.ToString() == header);

            // if not yet there, don't throw, just create it in place
            if (result == null)
            {
                result = new TreeViewItem { Header = header };
                parent.Items.Add(result);
            }

            // and return it
            return result;
        }
    }
Now, I threw away my overly engineered "solution" and came up with this instead:

    foreach (Car car in GetCars())
    {
        treeView1
            .FindOrAdd(car.Make)
            .FindOrAdd(car.Model)
            .FindOrAdd(car.Year.ToString())
            .FindOrAdd(car.Color.ToString())
            .Items.Add(new TreeViewItem { Header = car.ToString() });
    }

These are 8 lines instead of the 200+ I wrote first. This is much more manageable and flexible than my original solution - you can easily change the code to, say, first group by Model and then by Make, just by swapping two lines of code. If you're interested in the full source code of this sample, just leave a comment and I'll post it here (that's what I call lazy sample upload - long live the functional approach!). Besides extension methods, this sample also demonstrates auto-implemented properties, object initializers and lambda expressions. Try and spot them all!

More examples

I'll go ahead and post more examples of extension methods and how they help me write cleaner code everyday.

1. Collections

It's nice to be able to check if a collection doesn't contain items in a single step:

        public static bool IsEmpty<T>(this ICollection<T> possiblyNullCollection)
        {
            return possiblyNullCollection == null || possiblyNullCollection.Count == 0;
        }

ICollection<T> lacks AddRange:

        public static void AddRange<T>(this ICollection<T> collection, IEnumerable<T> items)
        {
            foreach (var item in items)
            {
                collection.Add(item);
            }
        }

Or adding several items separated by commas:

        public static void Add<T>(this ICollection<T> collection, params T[] items)
        {
            foreach (var item in items)
            {
                collection.Add(item);
            }
        }

It's nice to be able to call a given method for each element:

        public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
        {
            foreach (T item in collection)
            {
                action(item);
            }
        }

Or have a composable GetValue on Dictionary that doesn't throw:

        public static V GetValue<K, V>(this IDictionary<K, V> dictionary, K key)
        {
            V result;
            if (dictionary.TryGetValue(key, out result))
            {
                return result;
            }
            return default(V);
        }

2. Strings

For strings, everyone already seems to have developed their own little library. So I think I won't be original if I share these:

        public static bool IsEmpty(this string possiblyNullString)
        {
            return string.IsNullOrEmpty(possiblyNullString);
        }

        public static void Print(this string str)
        {
            Console.WriteLine(str);
        }

        public static void MessageBox(this string str)
        {
            System.Windows.Forms.MessageBox.Show(str);
        }

I also found it to be very convenient to have the following (trivial implementation, doesn't throw):

        public static int? ToInt(this string textNumber)
        public static bool? ToBool(this string boolValue)

These guys rock because they compose easily and don't distract me with the necessity of calling TryParse or the like. By the way, I think the composability of any TryParse-style method is not very good, in case you haven't noticed.

3. XML

Last but not least, a nice enhancement to XElement:

        public static string GetElementValue(this XElement element, string elementPath)
        {
            if (elementPath.IsEmpty())
            {
                return element.Value;
            }

            XElement subElement = element.Element(elementPath);
            if (subElement == null)
            {
                return null;
            }

            return subElement.Value;
        }

It doesn't throw, which saves me a lot of effort every time I call it - I don't have to be afraid that Element() might return null.


Of course, there are caveats. Many people criticize extension methods for their lack of discoverability. Indeed, extension methods for a type only appear in Visual Studio IntelliSense if you add a using directive for the namespace where the extension methods are declared. This is currently a discoverability problem.

Also, others mention that it is difficult to read code that uses extension methods, because we don't know what the methods mean or where they are declared. My answer: use F12 (Go To Definition) in Visual Studio.

Finally, there is a versioning problem: if a later version of the type introduces an instance method with the same name and signature, it silently wins over the extension method.

But still, I think extension methods are so cool and they provide so many advantages, that it is well worth using them anyway. Especially if you belong to those people who Know What They Are Doing.



Jacob Carpenter on named arguments in C#

Jacob Carpenter has an interesting post called Named parameters { Part = 2 }. As immutable types are very trendy these days (good!), questions arise about how to initialize them. C# 3.0 has object initializers (where IntelliSense even kindly shows you the list of remaining uninitialized properties), but unfortunately we cannot use them here, because they require the properties to be settable. Jacob came up with a pretty cool trick of using anonymous types to get a syntax similar to object initializers while still calling into a constructor. This technique is somewhat similar to casting by example, where we also abuse anonymous types to reach our goal.

To further improve Jacob's strategy, I would probably turn NamedArguments into a factory and make it responsible for instantiating the CoffeeOrder object as well. One could use Activator.CreateInstance to create an instance of the CoffeeOrder type and return it directly. Thus, the code for initializing the object could be further reduced to:

    var order1 = NamedArgs.For<CoffeeOrder>(new
    {
        DrinkName = "Drip",
        SizeOunces = 16
    });

John Rusk on Extension Interfaces

Sorry for not blogging recently - I was ramping up and getting started in my new role on the C# team. I love it here! The people are great, the job is interesting, and I'm really proud to work on C#. There are a lot of interesting things going on here, like F# and other very exciting projects I can't talk about at this point ;-)


I posted recently about duck-typing and extension interfaces. John Rusk pointed me to his great article about extension interfaces. It was posted back in 2006 but I only discovered it recently (sorry). It was interesting to see Keith Farmer's comments on the article, especially because I'm now sitting two offices down the hall from him so I could go and ask him what he thinks about it in person. But I digress.


I like John's ideas and considerations, and I currently can't think of any caveats to implementing extension interfaces from the design point of view. From the implementation perspective, there might be a complication John warns about. To make an existing class implement an interface, you'd usually write an adapter class. Now the problem is: a reference to the adapter object won't be the same as a reference to the actual target object, and that might become a problem, e.g. when you'd like to compare objects for identity. Being constructive, John also points to a possible solution involving CLR-internal special cases like TransparentProxy (see an as-always-excellent post by Chris Brumme about it).


I joined the Microsoft C# team

A couple of weeks ago I moved from Germany to Redmond, WA to join the C# IDE team. I have become an SDET (test developer) on the IDE QA team. Areas we own are the editor, syntax highlighting, IntelliSense, refactoring, Edit and Continue, Class View, Solution Explorer, etc. We do not cover any designers (such as the WinForms designer) or parts of the IDE not related to the C# language.

As you might have noticed from the contents of this blog, the C# team is probably the most interesting place within Microsoft for me to work. After my first week, I have to say that I'm really happy to be here and to meet amazingly smart and passionate people working on the stuff I'm most interested in.

I hope to continue blogging and to improve the quality of my postings, to keep up with the rest of my new team. If you're interested, here are a couple of links into our team's blogosphere. Using transitive closure, you can find the remaining C# blogs by following links from there.
  • Charlie Calvert is the community PM of the C# team and his blog is always an interesting resource to watch, because it accumulates other resources and is a recommended entry point to the C# blogosphere and community.
  • Rusty Miller is the C# test manager and has a couple of great posts about how we test C#. That way I could learn something about our team before coming here.
  • Gabriel Esparza-Romero is my manager and a great person to work with. He hasn't posted recently but I will still keep an eye on his blog :)
  • Eric Maino is also an SDET on our team, and he's been helping me get up to speed and ramp up.
  • CSharpening is a new blog of the C# QA team.
  • C# bloggers is the MSDN list of the bloggers on our team, unfortunately very far from being a complete list. I'll see who I have to bug here to have this list updated.
  • You will find links to many more great blogs from our team and from outside - in the rightmost column of my own blog. I highly recommend all of them, plus any other blogs I might have missed and still have to discover.
Oh, and feel free to contact me with any questions or feedback, I'd be happy to do my best to answer.



yield foreach

A while ago I submitted a suggestion to MS Connect: yield return should also be able to yield collections. A yield foreach statement could be a way to yield items one by one without explicitly writing yield return in a foreach loop - basically a way to flatten nested iterators. On the feedback page I also give an example of where this might be useful.

It turns out, the C# team has already been thinking about adding this feature in some future version (post-Orcas). Wes Dyer from the C# Compiler Team has an excellent post about this feature here.

An interesting question is: why is the foreach keyword there at all? Wouldn't it be sufficient to just write yield return MyCollection; and be done with it? The compiler would be able to figure out whether the thing implements IEnumerable, right? Well, consider what happens if the return type of the main (outer) iterator is IEnumerable<IEnumerable>. In that case the compiler wouldn't know what to do - return the collection as if it were a single item, or flatten it? yield foreach is a way to explicitly tell the compiler: yes, I'm sure, please emit the foreach loop for me.
Another advantage is that no new keyword has to be introduced - both yield and foreach are already reserved keywords.
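To illustrate, here's the kind of nested iterator this feature would simplify - a sketch with a made-up binary tree; the yield foreach lines in the comment are the hypothetical future syntax and do not compile today:

```csharp
using System;
using System.Collections.Generic;

class Tree
{
    public int Value;
    public Tree Left, Right;

    // Today each level has to re-yield its children's items by hand:
    public IEnumerable<int> InOrder()
    {
        if (Left != null)
        {
            foreach (int v in Left.InOrder())
            {
                yield return v;
            }
        }

        yield return Value;

        if (Right != null)
        {
            foreach (int v in Right.InOrder())
            {
                yield return v;
            }
        }
    }

    // With the proposed feature, the body could shrink to
    // (hypothetical syntax, does not compile today):
    //
    //     if (Left != null) yield foreach Left.InOrder();
    //     yield return Value;
    //     if (Right != null) yield foreach Right.InOrder();
}

class Program
{
    static void Main()
    {
        var root = new Tree
        {
            Value = 2,
            Left = new Tree { Value = 1 },
            Right = new Tree { Value = 3 }
        };

        foreach (int v in root.InOrder())
        {
            Console.Write(v); // prints "123"
        }
    }
}
```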

For those interested, Wes links to a research paper about iterators: Iterators revisited: proof rules and implementation


Flags Enum

Imagine we have an enum:

    enum BitField
    {
        ZeroBit = 1,
        OneBit = 2,
        TwoBit = 4,
        ThreeBit = 8
    }

I thought about syntactic sugar to simplify working with bits.
Consider the following:

BitField b = BitField.TwoBit | BitField.ThreeBit;

Testing bits:

// how about this sugar?
bool secondBitSet = b.TwoBit;
// instead of:
bool secondBitSet = (b & BitField.TwoBit) == BitField.TwoBit;

Setting bits:

// how about
b.TwoBit = true;
// instead of:
b |= BitField.TwoBit;

// Same for clearing and inverting bits:
b.TwoBit = false;
// or
b.TwoBit = !b.TwoBit;
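Until sugar like this exists, extension methods (C# 3.0) can give you most of the readability today. A sketch (the helper names are mine):

```csharp
using System;

[Flags]
enum BitField
{
    ZeroBit = 1,
    OneBit = 2,
    TwoBit = 4,
    ThreeBit = 8
}

static class BitFieldExtensions
{
    // Readable stand-ins for the hypothetical "b.TwoBit" syntax:
    public static bool IsSet(this BitField value, BitField flag)
    {
        return (value & flag) == flag;
    }

    public static BitField Set(this BitField value, BitField flag)
    {
        return value | flag;
    }

    public static BitField Clear(this BitField value, BitField flag)
    {
        return value & ~flag;
    }
}

class Program
{
    static void Main()
    {
        BitField b = BitField.TwoBit | BitField.ThreeBit;

        Console.WriteLine(b.IsSet(BitField.TwoBit));  // prints "True"

        b = b.Clear(BitField.TwoBit);
        Console.WriteLine(b.IsSet(BitField.TwoBit));  // prints "False"
    }
}
```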

If you like it, you can vote on Microsoft Connect: https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=247537


Why do we need "where T: enum" generic constraint

Here's an example from my code where I wish an "enum" generic constraint were available. Basically, it is a combo box for choosing a value from an enum's available values. It is automatically filled with the values of the enumeration and has a strongly typed Value property.

    public class Client
    {
        // use DriveTypeCombo as a usual ComboBox control
        public class DriveTypeCombo : EnumSelectorCombo<DriveType> { }

        private DriveTypeCombo driveTypeCombo1; // created by the designer

        // and use the Value property like this:
        void Foo()
        {
            driveTypeCombo1.Value = DriveType.CDRom;
        }
    }

    public class EnumSelectorCombo<TEnum> : EnumSelectorComboBox
        // where TEnum : enum -- this is what I wish I could write
    {
        public EnumSelectorCombo() : base(typeof(TEnum)) { }

        public TEnum Value
        {
            get
            {
                return (TEnum)Enum.Parse(typeof(TEnum), this.SelectedItem.ToString());
            }
            set
            {
                this.SelectedItem = value.ToString();
            }
        }
    }

    public partial class EnumSelectorComboBox : ComboBox
    {
        public EnumSelectorComboBox() : base() { }

        public EnumSelectorComboBox(Type enumeration) : this()
        {
            this.Enumeration = enumeration;
        }

        private Type mEnumeration;
        public Type Enumeration
        {
            get
            {
                return mEnumeration;
            }
            set
            {
                if (value != null && value != mEnumeration)
                {
                    mEnumeration = value;
                    FillItems();
                }
            }
        }

        private void FillItems()
        {
            this.DropDownStyle = ComboBoxStyle.DropDownList;
            this.Items.Clear();
            foreach (string name in Enum.GetNames(mEnumeration))
            {
                this.Items.Add(name);
            }
        }
    }



Making C# enums more usable - the Parse() method

I'll try to accumulate some feedback and thoughts about using enums in C#. There are several issues I see with the current (C# 2.0) enum API:

  • Methods like Parse() are not strongly typed

  • Working with flags and bits is a little cumbersome

  • There is no generic constraint where T: Enum

Today I'll start with the first issue: the Parse method. Here's how you have to use it:

MyEnum enumValue = (MyEnum)Enum.Parse(typeof(MyEnum), stringValue);

This is somewhat ugly. Christopher Bennage proposes a solution with generics, which I like (see his post). There are also a lot of other posts about the lack of generic methods on the Enum class (see posts by Scott Watermasysk, CyrusN, Dustin Campbell, etc.)

Still, a generic solution is not perfect from the readability point of view. What I mean is: wouldn't it be nice if the C# compiler could generate a strongly typed Parse method directly on the MyEnum type, so that the code could read:

MyEnum enumValue = MyEnum.Parse(stringValue);

See also a nice suggestion at MS Connect: https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=96897

But the problem is, you'd most probably have to change the CLR to achieve this behavior, not to mention all the tools that assume enum types cannot have members. I'm not exactly sure whether the C# compiler could generate custom methods on enum types or whether you'd have to change the CLR for it. The reason is that enums (which inherit from the special class System.Enum) are treated a little differently than other classes. I used ILDASM.exe to view the IL for the following code:

    namespace TestEnum
    {
        public enum MyEnumEnum
        {
            A,
            B
        }

        public sealed class MyEnumClass
        {
            public static MyEnumClass A;
            public static MyEnumClass B;
            public int value__;
        }
    }

In ILDASM, it looks like this:

See, an enum internally looks almost like a class - maybe it wouldn't be difficult to add methods to it?

Anyway, you can vote for such features to be implemented:

However, we can use extension methods in C# 3.0! Like this:

    public static T ParseAsEnum<T>(this string value)
        // where T : enum
    {
        if (string.IsNullOrEmpty(value))
        {
            throw new ArgumentNullException(
                "Can't parse an empty string");
        }

        Type enumType = typeof(T);
        if (!enumType.IsEnum)
        {
            throw new InvalidOperationException(
                "Here's why you need enum constraints!!!");
        }

        // warning, can throw
        return (T)Enum.Parse(enumType, value);
    }

It could be used like:

DriveType disk = "Network".ParseAsEnum<DriveType>();

Update: Thomas Watson also proposed adding an extension method directly to the Enum class in the comments here. Cool!


C# 3.0 Collection Initializers, Duck Typing and ISupportsAdd

Ilya Ryzhenkov from the ReSharper team has an interesting post: C# 3.0 Collection Initializers - Incomplete Feature? The problem is:

... restrictions are too strong - the type being constructed should implement IEnumerable and have an instance method "Add". IEnumerable is not a big deal, but the inability to use extension methods for Add is a deal breaker.

I think Ilya makes a very good point, and this feature indeed could be made more universally applicable. However, it is important to clearly understand the semantics of this feature to use it correctly - developers should know precisely what happens behind the curtains, otherwise unexpected side effects can occur.

Now let's brainstorm a little bit. Recently I wrote about more fine-granular interfaces. I exaggerated a little, but the idea was that an interface should define a minimal contract. Specifically, I expressed regret that there is no ISupportsAdd interface, because adding stuff is a widely used ability of many entities, not necessarily collections.

I think it would help a lot if we had an ISupportsAdd/ISupportsAdd<T> interface with a single void Add(T item) method (you could also name it IFillable, IAllowsAdd, ICanAdd, you name it). For places in the framework where Add methods are named differently (e.g. AddPermission), this interface could be implemented explicitly.

This leads us to another problem: there are already lots of shipped types that do not implement this "ISupportsAdd" interface. How to add an implementation of an interface to a type without changing the type (and its declaring assembly)? We'd need something like extension methods ("extension interfaces", anyone?). Orion Edwards makes (in my opinion) a terrific suggestion:

Define an alternative to an interface called "requirement". It would work and behave exactly the same as an interface EXCEPT that it would use duck typing instead of static typing. For example:
    public requirement Closeable
    {
        void Close();
    }

    public void TestMethod(Closeable c)
    {
        c.Close();
    }

    // works, because Form happens to have a Close method:
    TestMethod(new System.Windows.Forms.Form());

Collection initializers are another place where duck typing is sneaking into the C# language. Guess what the first one is? Right, the foreach statement. I was surprised too when I read a post by Krzysztof Cwalina about Duck Notation. It turns out foreach doesn't necessarily require IEnumerable - it can also use duck typing to recognize iterable entities.
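Here's a minimal sketch of duck-typed foreach with a made-up Countdown class: it implements no interface at all, yet foreach accepts it, because the compiler only looks for a GetEnumerator() method returning something with MoveNext() and Current.

```csharp
using System;

// Countdown implements no interface whatsoever - foreach binds
// to GetEnumerator/MoveNext/Current purely by name and shape.
class Countdown
{
    public CountdownEnumerator GetEnumerator()
    {
        return new CountdownEnumerator();
    }
}

class CountdownEnumerator
{
    private int current = 4;

    public bool MoveNext()
    {
        current--;
        return current >= 0;
    }

    public int Current
    {
        get { return current; }
    }
}

class Program
{
    static void Main()
    {
        foreach (int i in new Countdown())
        {
            Console.Write(i); // prints "3210"
        }
    }
}
```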

However, I'm not a professional language designer, and I have no idea how duck typing would behave in a statically typed language. The known problem with duck typing is that it matches members by spelling, not by semantics. There could be cases where members are named identically but have totally different semantics, so duck typing would destroy the benefits of static type checking by allowing semantically incompatible assignments. But with this explicit "requirement" duck typing, who knows - maybe it's a good idea.

I'm just saying that I already had a lot of cases where I regretted that a shipped type doesn't implement some interface - and I couldn't add this interface to the type declaration because I'm the consumer, and not the producer of the type. I believe that "Extension interfaces" or "explicit duck typing" would really help here.

P.S. Oh, and can anyone explain why we need to implement IEnumerable to be able to use collection initializers? I'm probably overlooking something obvious, but I thought I'd take the risk of sounding stupid and ask anyway :)

On duck typing in .NET see also:


Compiler as a black-box

I'd like to share some personal thoughts and observations about compilers and their integration into an IDE. This topic might be interesting for those who design programming tools and IDE add-ins. A disclaimer: I'm not an expert in IDE design, but I'm learning, so if you have something to say, I welcome your feedback. Probably I'm saying trivial and well-known things, so please bear with me.

Observation number one - in some IDEs, a compiler is a black-box, meaning the interface between the compiler and the IDE is text-based: source code is input, and binaries are output. Compiler error messages and warnings are returned as plain text, with line and column numbers which indicate the position of an error in the source code.

This would be a good approach if we were binding the compiler to a simple text editor - one could swap in another editor or another compiler without even recompiling the whole system. But nowadays, compiler functionality is also required outside the command-line compiler - for features like code completion, refactoring, etc. I think it no longer makes sense to encapsulate the compiler in a black box with input and output; instead, the compiler's internals should be exposed to the rest of the IDE and tools.

The classical Dragon Book pipeline splits the whole compilation process into phases (scanner - parser - resolver - code generator). Each phase has input and output and forms a finer-grained black-box. For example, the parser receives a stream of tokens and outputs an abstract syntax tree (AST). With this architecture, we can plug additional steps (e.g. tree transformations) in between compiler phases, such as between the resolver and the code generator. This would be highly useful for tools that want to extend the language or the IDE (AOP, design-by-contract, code generation, etc.).

However, some compilers hide this pipeline from us, encapsulating the compilation process in a single black-box. Someone even coined the term "monolithic compiler". Many tool developers express a growing need for compilers to expose these internal compilation steps and to provide hooks to plug custom functionality into the compilation process.

Once exposed through classes and interfaces (a compiler API), one could apply various design patterns to extend and modify the compilation process. For example, one could wrap the parser in a decorator that performs additional tree transformations, or append additional output to the code generator. The whole compiler would provide a factory that produces scanners, parsers, code generators, etc. One could plug one's own parts into this factory to replace the defaults.
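As an illustration of the decorator idea (all types here are hypothetical sketches, not any real compiler's API):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical compiler API types, invented for this sketch.
public class Token { public string Text; }
public class SyntaxTree { public List<string> Annotations = new List<string>(); }

public interface IParser
{
    SyntaxTree Parse(IEnumerable<Token> tokens);
}

public class SimpleParser : IParser
{
    public SyntaxTree Parse(IEnumerable<Token> tokens) { return new SyntaxTree(); }
}

// A decorator that plugs tree transformations in between the parser
// and the later phases (resolver, code generator), without the core
// parser knowing anything about them.
public class TransformingParser : IParser
{
    private IParser inner;
    private Func<SyntaxTree, SyntaxTree>[] transformations;

    public TransformingParser(IParser inner, params Func<SyntaxTree, SyntaxTree>[] transformations)
    {
        this.inner = inner;
        this.transformations = transformations;
    }

    public SyntaxTree Parse(IEnumerable<Token> tokens)
    {
        SyntaxTree tree = inner.Parse(tokens);
        foreach (Func<SyntaxTree, SyntaxTree> transformation in transformations)
        {
            tree = transformation(tree);
        }
        return tree;
    }
}
```

An AOP or design-by-contract tool would then hand its transformation to the factory, replacing the default parser with the decorated one.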

This is one important step towards compile-time reflection (which would enable things like syntactic macros, quasi-quotation and metaprogramming in general). My favorite example of these technologies is the Nemerle language.

An advantage of exposing a compiler API would be the reusability of the compiler functionality. With a monolithic compiler, one needs to duplicate functionality to implement code completion, refactoring, etc. An IDE becomes full of places that round-trip from source text to AST and back. With an extensible compiler, the AST would become the main data structure of the IDE. Soon I plan to blog more about the AST as the primary data structure of the IDE, as opposed to the source code as text.

Please note, I'm not talking about any concrete IDE implementations, because I haven't had a chance to look at them more closely. From what I've heard, Eclipse does a pretty good job of sharing its AST with plug-ins. In some future post I'll talk about SharpDevelop - the IDE I have some experience with.

In the meantime, here are some links for those who found this topic interesting:

I'd love to hear your opinions and feedback on this. Thanks!



Today just a quick post about language design.

SecretGeek pointed me to this article: Create Mixins with Interfaces and Extension Methods, and I really liked the idea. It reminded me of Haskell's type classes, where you can implement part of an interface based on another part. Later, you don't have to implement the whole interface: the "default" part gets implemented automatically.

Generally, I'm fond of the idea of traits/mixins, this would be useful in quite a number of situations, especially when you'd like to share some functionality across different class hierarchies. Here's a popular link to the research about traits: Traits — Composable Units of Behavior
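To make the technique from the article concrete, here's a minimal sketch (all type names are my own invention): the interface declares only the essential member, and extension methods mix in derived behavior on top of it, much like default implementations in a Haskell type class.

```csharp
using System.Collections.Generic;

// The interface declares only the essential ability...
public interface IAllowsAdd<T>
{
    void Add(T item);
}

// ...and extension methods "mix in" derived behavior for free.
// Every implementor of IAllowsAdd<T> gets AddRange automatically.
public static class AllowsAddExtensions
{
    public static void AddRange<T>(this IAllowsAdd<T> target, IEnumerable<T> items)
    {
        foreach (T item in items)
        {
            target.Add(item);
        }
    }
}

// A sample implementor only has to supply the one essential member:
public class Bag : IAllowsAdd<int>
{
    public List<int> Items = new List<int>();
    public void Add(int item) { Items.Add(item); }
}
```

Usage is then simply `new Bag().AddRange(new int[] { 1, 2, 3 });` - the mixed-in member looks like a first-class part of the type.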

Update: I found more interesting links about this:


Static analysis and source code querying

Professionally, I am very interested in developer tools, especially how to develop them properly. One kind of developer tool lets developers analyse code to extract statistics or other characteristics about it. This is called static analysis, because the information about the code is obtained at compile time, when the code is not even running yet.

Today, I'd like to write about three tools in this area that I'm interested in.

1. SemmleCode

Released by Semmle as a free product, this Eclipse plug-in lets you write and execute queries against a source code base using the .QL query language. .QL is a specially developed SQL/LINQ-like query language whose interesting properties are extensibility and object orientation. An example:

from Field f
where f.hasModifier("public")
  and not f.hasModifier("final")
select f.getDeclaringType().getPackage(), f.getDeclaringType(), f

This query returns all public non-final fields, and for each field it also returns the type and package where the field is defined.

How can SemmleCode be useful? The website gives six mainline usage scenarios:
  1. Search and Navigate code
  2. Find bugs
  3. Compute metrics
  4. Enforce coding conventions
  5. Generate charts and graphs
  6. Share your queries

How does SemmleCode work? First, it walks the entire source code and parses it into an intermediate representation. Eclipse is kind enough to provide tools with a Java parser and full access to the AST, so the Semmle folks didn't have to write their own Java parser. Note how great it is when an IDE takes such good care of its tools and lets them become part of the IDE family.
Anyway, SemmleCode then dumps the AST into a relational database, although only class and member information is stored. Currently they don't go down to the statement level and mostly do inter-procedural analysis (not intra-procedural). However, method calls still land in the DB, which is a good thing.

When you execute your query, it is internally rewritten into Datalog, a dialect of Prolog. Prolog is a terrific eye-opener and deserves a separate post in the future. Finally, the Datalog is converted into very efficient, highly optimised SQL, which is then run against the DB engine.

To sum up, Semmle emphasizes flexible, arbitrary querying against the code model. This is a somewhat different usage pattern compared to checking against fixed, predefined rules, as FxCop does, for example. SemmleCode is more about discovery and analysis, while FxCop is more about automated quality control and checking.

That's about it. The tool is great, .QL is expressive, and Semmle is moving forward with promising regularity. Watch them at QCon in San Francisco later this year.

2. .NET tools

OK, Eclipse is good, but what about the rest of us .NET folk? Well, first there is NDepend, which I still haven't had a chance to look at (sorry Patrick!). But it looks like a good tool, and I should definitely give it a try in my spare time.

Then there is FxCop, the widely used one. FxCop contains a library of distilled developer experience formulated as rules. The code is checked against the rules, and FxCop annoys developers until they either fix the code or finally lose their temper and just turn the offending rule off :) It is noteworthy that FxCop doesn't parse source code - it goes in the reverse direction and analyses the compiled assemblies.

But today I'd especially like to write about NStatic, a promising tool I'm really excited about. Wesner Moise is the talented developer behind it, applying AI and algebraic methods to code analysis. As of now, NStatic hasn't been released yet, but I'm closely watching Wesner's blog, which is a real wealth of insightful information. Besides that, Wesner seems to like the idea of structured editing, which also happens to be my own passion.

3. Sotograph

Last, but not least, another product I'm interested in - http://www.software-tomography.com. This tool emphasizes visualization of large systems, and the metaphor behind it comes from medicine. Just like tomography lets doctors peek into the human body to see what exactly is wrong, Sotograph visualizes large software systems to analyse dependencies and find architecture flaws.

Software Tomography recently introduced a highly-efficient C# parser specially developed at the University of Linz, Austria - home of Prof. Hanspeter Mössenböck, the creator of Coco/R, a .NET parser generator. This is also a good topic for a separate post.

One possible usage scenario for such tools could be determining dependencies between subsystems, for example when planning a large refactoring or other massive code changes. Static analysis tools let us peek into the future and see which dependencies would break if we change this or that. We can also conduct targeted searches using source code querying. Whatever we do - we do it, in the end, to increase code quality and plan for future maintenance and scalability.

Update: see also my del.icio.us links about static analysis: http://del.icio.us/KirillOsenkov/StaticAnalysis


A usage scenario for empty "marker" interfaces

There is a well-known piece of advice (probably originating from the FDG book) to avoid interfaces with no members. Such interfaces are mostly used to mark types, and testing whether a type is marked is done with the "is" operator, like this:

if (myObject is INamespaceLevel)

It is suggested to use attributes instead, for example like this:

if (!obj.GetType().IsDefined(
        typeof(ObsoleteAttribute), false))

The advantages of attributes are:
  1. they can have parameters
  2. you can precisely control how attributes are inherited. You can easily turn the attribute off on a derived type, whereas you can't erase an interface from a derived type's inheritance tree, if a base type already implements it.

The disadvantages of attributes are:
  1. the clumsy syntax
  2. the runtime cost of checking (reflection is slower than the is operator).
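To make the comparison concrete, here is a small sketch (type and attribute names invented for illustration) showing both advantages at work: the attribute carries a parameter, and AttributeUsage with Inherited = false turns the marker off for derived types - something you can't do with a marker interface.

```csharp
using System;

// Advantage 1: the attribute can carry parameters.
// Advantage 2: Inherited = false means derived types are NOT marked,
// unlike an interface, which a derived type can never "un-implement".
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public class NamespaceLevelAttribute : Attribute
{
    public string Comment;
    public NamespaceLevelAttribute(string comment) { Comment = comment; }
}

[NamespaceLevel("may appear directly inside a namespace")]
public class ClassBlock { }

// Not marked, even though it derives from a marked type:
public class NestedClassBlock : ClassBlock { }
```

Checking the marker is then `type.IsDefined(typeof(NamespaceLevelAttribute), true)` - which returns false for NestedClassBlock, exactly the control a marker interface can't give you.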

A possible usage scenario for marker interfaces
I had one situation so far where marker interfaces seem to be quite useful and look neat. Frankly, as I write this, I realize that I could have taken attributes as well, but I already wrote too much of a post so it's a pity to throw it away. When I started this post, I was a strong believer that marker interfaces are good, now I think attributes deserve a chance as well :-)

Anyway, here's the example that I originally wanted to post (and you judge by yourself if marker interfaces are justified here). In the C# code editor I am building, language constructs were modeled by types inheriting from the Block class, for example, like this (a small subtree of the entire AST):

Now, some blocks are allowed to be nested in other blocks. For example, a class can be nested within a namespace or another class, and a method with a body can be nested in a class or a struct. To determine where a language construct can be used, I introduced a set of marker interfaces:

public interface INamespaceLevel { }
public interface IClassLevel { }
public interface IMethodLevel { }

Now, when we drag-and-drop or copy-paste blocks, determining if a block can be dropped within a container is easy. Each container has a list of allowed interfaces that can be accepted (not necessarily a single interface, we want to be flexible). Once we drag a block over the container, we look if the dragged block implements any of the interfaces we can accept:

bool foundAssignable = false;
foreach (Type acceptableType in AcceptableBlockTypes)
{
    if (acceptableType.IsAssignableFrom(dragged.GetType()))
    {
        foundAssignable = true;
        break;
    }
}

We fill the AcceptableBlockTypes set by calling one of the AddAcceptableBlockTypes overloads on the container.


And here's the definition for AddAcceptableBlockTypes:

private Set<Type> AcceptableBlockTypes = new Set<Type>();

public void AddAcceptableBlockTypes(params Type[] acceptableBlockTypes)
{
    foreach (Type t in acceptableBlockTypes)
    {
        if (!AcceptableBlockTypes.Contains(t))
            AcceptableBlockTypes.Add(t);
    }
}

public void AddAcceptableBlockTypes<T1>()
{
    AddAcceptableBlockTypes(typeof(T1));
}

public void AddAcceptableBlockTypes<T1, T2>()
{
    AddAcceptableBlockTypes(typeof(T1), typeof(T2));
}

public void AddAcceptableBlockTypes<T1, T2, T3>()
{
    AddAcceptableBlockTypes(typeof(T1), typeof(T2), typeof(T3));
}

Now I wonder how sane this is and if I should really have taken attributes instead. I like the usability of the current API and it looks like the approach works fine for my editor. Now let's wait and see how it scales as I modernize the editor to support more recent C# versions than 1.0 :) I'll keep you posted.

More about Interface usage in .NET

In a recent post I shared some personal experiences about when to use interfaces or abstract classes.

As it turns out, the internet is full of information and advice about it:
  • Evan Hoff has an interesting post dividing the interface usage scenarios into three groups: interfaces modeling object characteristics, capabilities and complex entities. To reiterate, the only reason I see to model a complex entity as an interface is that implementations will use more than one base class as roots of the class hierarchy.
  • Thomas Gravgaard in the post Random Ramblings and Rumblings: The Interface Tax confirms my experience about duplicating members in both a class and its interface. I totally agree with him, that an interface in this case is mostly redundant and not justified. I also learned the cool YAGNI acronym. If you speak German, I like this description more.
  • An advice to avoid marker interfaces is ubiquitous, although I still can't find any justification for it. I'll disagree with this advice in my future post.


Making ICollection inherit from more fine-granular interfaces

In my previous post I hinted about a rule which I apply to the design of interfaces:

An interface should have at most one member.

This is to emphasize, that interfaces often model abilities that are mixed in to the main entity. And you mostly need only one member to model an ability.

To illustrate this, here's an example. Sometimes I regret that System.Collections.Generic.ICollection&lt;T&gt; isn't composed of such fine-granular little interfaces. It would really make things simpler in some scenarios. For example, consider a method which adds a lot of stuff into some bag:
void AddLotsOfStuff(ICollection<T> bag)
{
    bag.Add(stuff1); ... bag.Add(stuff1000000);
}

Now my problem is this: I'd like to also pass objects that don't implement ICollection<T> into this method. I'd implement only the Add method, and I don't want to additionally implement Remove, Count, IsReadOnly, etc. If ICollection<T> inherited from IAllowsAdd<T>, things would be much simpler:
void AddLotsOfStuff(IAllowsAdd<T> bag)
{
    bag.Add(stuff1); ... bag.Add(stuff1000000);
}
I'm using only the Add ability of the bag inside the method anyway, so I don't care about other ICollection abilities - I don't need them here. And now - voila - I can pass ICollections and my custom objects that solely implement IAllowsAdd.

You might ask: why not simply return the stuff in an IEnumerable instead of stuffing it into a bag passed as a parameter? Well, one could do that, but it would be a little slower - we'd need some temporary storage to hold the output before we actually put it in the bag outside. In time-critical scenarios you can save a lot of time by putting your stuff directly into the bag instead of first returning it and then adding it to the bag somewhere outside the method.

A more concrete example - imagine some code editor would like to show an IntelliSense drop-down list box with 1000 words, and you have a GetPossibleCompletionWords() method that is supposed to put those 1000 strings into the listbox. If you first return the strings, you have to accumulate them in some temporary list:
List<string> results = new List<string>();
results.Add(word1); ...
return results;

This costs both time and memory. Even if you use iterators and yield return, you will have additional costs connected with storing the method's control-flow state machine on the heap. It saves a lot of time to insert the words directly into your destination listbox and pass the listbox as a parameter. However, you have a problem: your listbox doesn't implement ICollection<T>. But it could implement IAllowsAdd (or IFillable, or ICanAddStuff or whatever).

That's why I wish the ICollection<T> interface of the .NET Framework actually consisted of more fine-granular interfaces which could be reused in places where I don't need the entire collection. And - who knows - maybe such interfaces will be introduced sometime in the future, because splitting an interface into several new interfaces is not a breaking change - one could continue to use the existing ICollection<T>, but one would also get new, reusable interfaces like IAllowsAdd, ICountable, ISet, etc.

Please let me know what you think about it - I welcome your opinions, ideas and suggestions.

Update: it seems that there is a nice workaround with delegates. You can just pass a delegate as parameter to the method. This delegate can point to any Add method you like. This would be even more flexible than using fine-granular interfaces. Hmm... this sounds good: everywhere where you'd use an interface with one method, you could actually use a delegate instead. The performance tradeoff would be virtual call vs. delegate invocation.
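For illustration, the delegate-based variant of the method could look like this (method and type names are my own; Action&lt;T&gt; has been part of .NET since 2.0):

```csharp
using System;

public static class Completion
{
    // Instead of requiring an interface, accept a delegate that can
    // point to any compatible Add method on the caller's side:
    // List<string>.Add, ListBox items, or an arbitrary lambda.
    public static void AddLotsOfStuff(Action<string> add)
    {
        add("word1");
        add("word2");
        add("word3");
    }
}
```

A caller with a List&lt;string&gt; just passes the method group: `Completion.AddLotsOfStuff(myList.Add);` - no interface implementation required at all.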

Choosing: Interface vs. Abstract Class

Interfaces and abstract classes in .NET are pretty similar in functionality - you can't instantiate either of them, but objects of various runtime types can hide behind both an interface and an abstract class. So when to use which?

The "Framework Design Guidelines" book by Brad Abrams and Krzysztof Cwalina is a very good book for all those who like asking such questions. Particularly, there is a whole section 4.3 "Choosing between class and interface" discussing this issue. I'll spoil the plot ending and post the main advice right away:

Do favor defining classes over interfaces.

The main argument of the authors is the flexibility of classes. This applies mostly to layered architectures, where a client layer uses the framework layer. Classes can provide a default implementation, so even if you already have derived classes in your client layer and you add a new virtual method to the base class in the framework, the clients won't even have to recompile their derived classes: the new virtual method automatically appears in the derived types, and nothing changes on the client side. On the contrary, if you add a new method to an interface that is already implemented by some clients, you can't provide a default implementation - it would be a breaking change (all clients would have to manually implement the new method of the interface).

This is a very good point - once you ship an interface you basically can't change it anymore - it's carved in stone starting from the very moment anyone on the client side implements it. Unless you want to break that client and make them recompile or even worse - re-deploy.
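Here's a small sketch of that flexibility (all names invented): a virtual member added in a later framework version ships with a default implementation, so an existing derived class keeps working without any changes.

```csharp
// Hypothetical framework base class.
public abstract class Formatter
{
    public abstract string Format(string text);

    // Imagine this member was ADDED in version 2 of the framework,
    // with a default implementation. Existing client subclasses
    // keep compiling and simply inherit the new behavior.
    public virtual string FormatHeader(string text)
    {
        return "== " + text + " ==";
    }
}

// A client class written against version 1 of the framework:
public class UpperCaseFormatter : Formatter
{
    public override string Format(string text) { return text.ToUpper(); }
}
```

Had Formatter been an IFormatter interface instead, adding FormatHeader to it would have broken UpperCaseFormatter's compilation until the client implemented it by hand.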

When I was developing an editor for C# it was based on a framework for editors. During design of the framework and the editor, I learned a lot about how to use interfaces (more precisely, how not to use them). Now I want to share my experiences with you to add a couple of measly cents to the dollars of wisdom of .NET architects.

My whole editor framework is built around a concept of a 'block' - a rectangular thing which can be displayed on the screen and contain text and other blocks. The framework is basically a hierarchy of classes that inherit from the Block class. If you're familiar with the Composite design pattern and the text editor example from the GoF book, you'll immediately recognize that blocks are simply Glyphs - basic elements from which documents are composed. Now, I had an interface IBlock and an abstract class Block, which implemented IBlock. As it turned out many lines of code later, this was the most stupid thing to do:

  • Anytime I wanted to add new members to the class hierarchy, I had to update both the base class and the interface

  • Once I'd ship the IBlock interface and a client would implement it, I wouldn't be able to add anything to it anymore

  • The only situation where IBlock would be useful is when some client would want to implement it without inheriting from Block - and who on earth would ever want that?

I removed the IBlock interface and changed all its usages to use the abstract class Block instead. This simplified my life so much.

What I've learned from this: interfaces are good when you want to work with some distinguished feature or ability of an entity, not the whole entity. For example, if you want to enumerate something, it is sufficient to receive an IEnumerable - you don't care about the rest of the entity. It can be anything, of any complexity - all that matters to you is that it can be enumerated. A good occasion to use an interface is when you have many different base classes - you don't have to inherit from Collection; you keep your various base classes, and the only thing you have to do is implement IEnumerable. As I said, interfaces describe part of the functionality, and you can mix this functionality into existing class hierarchies without adding a new base class.

Classes, on the contrary, are good for modeling whole, complete entities. That was my mistake - I used an interface for the whole entity "Block", where I should have used a class instead. With classes, you can't mix a new base class into an existing one - that would mean merging two clearly separate, stand-alone entities into some weird-looking hybrid. That is one of the reasons multiple inheritance is not allowed in .NET - you shouldn't mix entities together; it's not good practice. Instead, you should do your best to separate your entities from one another. It's like normalization of relational databases - you split your tables until each table describes a clearly defined entity with exact boundaries and strictly defined relationships to other entities.

A common design mistake is called "burning the base class" - it's when you make your whole hierarchy inherit from some base class just to add some functionality to the entire class hierarchy in one simple step. As a rule, this functionality doesn't represent a separate, stand-alone entity, but you still waste your base class on adding partial functionality. As I said before, interfaces are best when you want to add part of a functionality - you can mix it into an existing hierarchy without burning the base class. An example of burning the base class is System.MarshalByRefObject from the .NET base class library. Clearly, we waste the base class of all Windows Forms controls just to add one ability to all the classes at once. Adding an ability is best done with interfaces. However, this way requires more typing, because an interface can't provide a default implementation.

To reiterate: use classes to model whole entities, and use interfaces to add abilities to existing entities. The only exception to this rule I can currently think of - is when you don't want your entities to be extended in the future. That is, you'd like to specify a precise contract for an entity, and you can guarantee that this entity won't have additional abilities in the future - at least not in your scope. Then, it's OK to use an interface to capture the entity and carve it in stone. Clients are free to add functionality to this entity, but you don't want to see it in your framework. You'll only work with this strictly defined entity and it's not going to change.

Finally, I've discovered a rule of thumb for myself which generally helps me use interfaces correctly. An interface should define at most one member. Yes, it's that simple. One could even have an FxCop rule for it. If you want to define an ability on some entity, one member is mostly enough: GetEnumerator() for an enumerable, Clear() for something clearable, Count for something countable, and so on and so forth. If you want a more complex ability or a composed entity, use interface inheritance:

ICollection : IAllowsAdd, IAllowsRemove, IEnumerable, IClearable, IIndexable, ICountable, ISet
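As a sketch of how such fine-granular abilities might compose (none of these interfaces exist in the BCL; the names are hypothetical):

```csharp
using System.Collections;

// Single-member "ability" interfaces:
public interface IAllowsAdd { void Add(object item); }
public interface IAllowsRemove { void Remove(object item); }
public interface IClearable { void Clear(); }
public interface ICountable { int Count { get; } }

// The composed entity is just a sum of abilities:
public interface IBag : IAllowsAdd, IAllowsRemove, IClearable, ICountable, IEnumerable { }

public class SimpleBag : IBag
{
    private ArrayList items = new ArrayList();
    public void Add(object item) { items.Add(item); }
    public void Remove(object item) { items.Remove(item); }
    public void Clear() { items.Clear(); }
    public int Count { get { return items.Count; } }
    public IEnumerator GetEnumerator() { return items.GetEnumerator(); }
}
```

A method that only needs to add items can now take an IAllowsAdd parameter, and any IBag (or anything else with just an Add) qualifies.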

Purists can push this rule even further:
An interface can either:
  1. declare no members and inherit from no interfaces ("a marker interface"), or

  2. declare a single member, or

  3. inherit from two other interfaces.

This would be some minimal "interface normal form" à la Chomsky, to which any interface hierarchy could be reduced. Unfortunately, this mathematically beautiful "normal form" wouldn't be practical for everyday development.



Structured editors

This post is about structured (or syntax-driven) code editors and my experiences in this area. Right away, here are some screenshots as bait - this is what my current structured editor implementation looks like.

Structured editing - to be or not to be?

Structured editing is a topic that has been surrounded by scepticism and controversy for the past 20 years. Some argue that directly editing the AST on screen is inflexible and inconvenient, because the constraint of always having a correct program restricts the programmer way too much. Others expect structured editors to be more helpful than text editors, because the user operates atomically and precisely on language constructs, concentrating on the semantics and not on the syntax.

In summer 2004, my professor initiated a student research project - we started building a structured editor for C#. I took part because I was deeply persuaded that good structured editors can actually be built, and it was challenging to go and find out for myself.

As one of numerous confirmations for my thoughts, in 2004, Wesner Moise wrote:

...I see a revolution brewing within the next three years in the way source code is written.
Text editors are going to go away (for source code, that is)! Don't get me wrong, source code will still be in text files. However, future code editors will parse the code directly from the text file and will display it in a concise, graphical and nicely presented view, with each element in the view representing a parse tree node. ...

I remember how I agreed with this! Three years later, in 2007, the prototype implementation is ready - it became the result of my master's thesis. I still agree with what Wesner envisioned in 2004 - with one exception. Now I believe that structured editors shouldn't (and can't) be a revolution - fully replacing text editors is a bad idea. Instead, structured editors should complement text editors to provide yet another view on the same source tree (the internal representation, or AST).

General conclusions

As a result of my work, I'm convinced that structured editors actually are, in some situations, more convenient than text editors, and that providing the programmer with two views on the code to choose from would be a benefit. Just like the Visual Studio Class Designer - those who want to use it just use it, and the rest happily continue to use the text editor. All these views should co-exist to provide the programmer with a richer palette of tools to choose from.

Hence my first important conclusion: a program's internal representation (the AST) should be observable to enable an MVC architecture - many views on the same internal code model. With MVC, all views are automatically kept in sync with the model. This is where, for example, something like WPF data-binding would come in handy.
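As a tiny sketch of what "observable" could mean here (this is not the actual editor code, just the shape of the idea): a model node raises an event on every change, and any number of views can subscribe and repaint themselves.

```csharp
using System;
using System.Collections.Generic;

// A minimal observable AST node. Views (text view, structured view,
// class diagram) subscribe to Changed and stay in sync with the model.
public class Block
{
    public event EventHandler Changed;

    private List<Block> children = new List<Block>();

    public void Add(Block child)
    {
        children.Add(child);
        OnChanged();
    }

    protected void OnChanged()
    {
        if (Changed != null)
        {
            Changed(this, EventArgs.Empty);
        }
    }
}
```

Each view registers once (`root.Changed += RefreshView;`) and never needs to poll the model - the MVC synchronization falls out of the event wiring.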

As for the structured editor itself - it is still a work in progress, and I still hope to create a decent replacem... err... complement for text editors. It has to be usable, and there are still a lot of problems to solve before I can say: "Here, this editor is at least as good as the text editor". But I've managed to solve so many challenging problems already that I'm optimistic about the future.

Current implementation

The current implementation edits a substantial subset of C# 1.0 - namespaces, types, members (except events), and almost all statements. If you're interested, you can read more at http://www.guilabs.net/ and www.osenkov.com/diplom - those are two sites I built to tell the world about my efforts. I also accumulate my links about structured editing at del.icio.us: http://del.icio.us/KirillOsenkov/StructuredEditors

Here, I'll just add one more thing. It turned out that it makes sense to build structured editors not only for C#, but for other languages as well - XML, HTML, Epigram, Nemerle, etc. That is why, from the very beginning, the whole project was split into two parts - the editor framework and the C# editor built on top of it. In some later post I'll share my experience of building frameworks.

If you want to know more, or if you want to share your opinion on this, please let me know. Thanks!

New blog about design and developer tools

I can't resist anymore. I have to start blogging.

First things first - an introduction. My name is Kirill Osenkov, and one could say that I'm a software developer who develops software for software developers. That puts it quite nicely - my main interests currently lie in the design and architecture of developer tools. I spent two summer internships at Microsoft working with the DSL Tools team. I learned a lot there, especially about using DSLs to raise expressiveness while modeling and programming. In July 2007 I finished my Master's Thesis, in which I built an experimental structured (syntax-driven) editor for C#. And now I hope to continue working on developer tools and to share my thoughts via this blog.

Now about what you can expect here. Going meta is popular these days, in the age of DSL, intentional programming, language-oriented programming, and all those X-oriented-programmings out there. For me, going meta means reflecting on how developers develop software and how to develop good developer tools. That's why I'm interested in parsers and parse trees, compiler API, languages, syntax, language services, resolvers, editors and how to put all this stuff together in an extensible way. Clearly, I'm not the only one who's interested in this stuff.

Other than that, I'll also write about design, architecture, OOP and .NET programming.