
This is part of a series of questions which focuses on the Abstraction Project, which aims to abstract the concepts used in language design in the form of a framework. There is a sister project to Abstraction called OILexer, which aims to construct a parser from grammar files, without the use of code injection on matches.

Other pages associated with these questions cover structural typing (viewable here) and ease of use (found here); a question on writing the compiler compiler can be found here. The meta-topic associated with an inquiry about the framework and the proper place to post can be found here.

One of the steps in this process was building my own ECMA-335 metadata parser. Since I tend to build tools to build tools, I recently decided to do a top-down analysis of the .NET Base Class Libraries. The future goal is to use it to construct libraries which represent the BCL to aid in rapid code generation (more on that later if someone cares).

I wanted to get a basic idea of the overall versions of libraries within the .NET BCL, so I analyzed the Framework folder stored locally. Doing this per framework version should reflect the end user's system; right now I'm targeting my own machine.

The first step toward this goal is to construct a class which conceptualizes a multi-versioned library and pulls in each iteration as needed. The next step is pulling in all the types from each version and comparing the individual lists of types (oddly more work than it should be).

The result is a fairly straightforward view of the BCL which shows the version in which each library was introduced and the version in which new types were introduced, grouped by namespace.

This should be an exhaustive list, excluding types which were introduced with the library itself. You can assume that if a type isn't listed, it was there from the library's inception. The view is very simple: +Type means it was added, -Type means it was removed, and Type->Assembly means it was relocated at that version.
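The comparison behind that view can be sketched roughly as follows. This is a minimal illustration in Python rather than the actual tool: the snapshot shape (assembly name mapped to a set of public type names), the function name `diff_versions`, and all sample names are assumptions made up for the example, and it assumes each type name lives in only one assembly per version.

```python
from typing import Dict, List, Set

# Hypothetical snapshot shape: assembly name -> set of public type names.
Snapshot = Dict[str, Set[str]]

def diff_versions(old: Snapshot, new: Snapshot) -> Dict[str, List[str]]:
    """Describe changes per assembly as +Type, -Type, or Type->Assembly."""
    # Assumption: a type name appears in at most one assembly per snapshot.
    old_home = {t: asm for asm, types in old.items() for t in types}
    new_home = {t: asm for asm, types in new.items() for t in types}
    changes: Dict[str, List[str]] = {}
    # Only assemblies that already existed are reported; types that arrive
    # together with a brand-new library are deliberately left out.
    for asm in sorted(old):
        before = old[asm]
        after = new.get(asm, set())
        lines: List[str] = []
        for t in sorted(after - before):
            if t not in old_home:  # types relocated in are not "added"
                lines.append("+" + t)
        for t in sorted(before - after):
            if t in new_home:
                lines.append(f"{t}->{new_home[t]}")  # relocated
            else:
                lines.append("-" + t)  # removed outright
        if lines:
            changes[asm] = lines
    return changes

# Invented sample data, not real BCL contents.
V1: Snapshot = {"Example.Core": {"Example.Alpha", "Example.Beta", "Example.Gamma"}}
V2: Snapshot = {
    "Example.Core": {"Example.Alpha", "Example.Delta"},
    "Example.Extras": {"Example.Gamma", "Example.Omega"},
}

for assembly, lines in diff_versions(V1, V2).items():
    print(assembly, lines)
```

Running the diff over each consecutive pair of versions would give one block of entries per version, which is the grouping described above.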

The objective side of this is: if it's accurate, it gives you a basic breakdown of when a type was introduced, which is useful for determining the lowest version of .NET needed for a given piece of functionality. The next step is member analysis, which will make the first step look easy (mostly due to method signature comparison, since each parameter type will have to be reversioned and looked up).

I'm curious if this analysis could later be used for an automated tool which could determine the lowest common framework version for an assembly as a whole (software being written by the compiler framework I'm writing, that is)?
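One way such a tool might work, sketched in Python under the assumption that the analysis yields an index from type name to the framework version that introduced it: the lowest workable version is the highest introduction version among everything the assembly references. The function names and the small hand-written index below are illustrative only, and the sketch ignores removals and relocations.

```python
from typing import Dict, Iterable, Optional, Tuple

Version = Tuple[int, ...]

def parse_version(text: str) -> Version:
    """Turn '3.5' into (3, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def lowest_common_version(
    used_types: Iterable[str], introduced_in: Dict[str, Version]
) -> Optional[Version]:
    """Return the smallest framework version that contains every used type."""
    used = list(used_types)
    unknown = sorted(t for t in used if t not in introduced_in)
    if unknown:
        raise KeyError(f"no version data for: {', '.join(unknown)}")
    # The newest requirement decides; None when nothing is referenced.
    return max((introduced_in[t] for t in used), default=None)

# Hand-written sample index, not output of the real analysis.
INTRODUCED_IN: Dict[str, Version] = {
    "System.String": parse_version("1.0"),
    "System.Collections.Generic.List`1": parse_version("2.0"),
    "System.Linq.Enumerable": parse_version("3.5"),
    "System.Threading.Tasks.Task": parse_version("4.0"),
}

needed = lowest_common_version(
    ["System.String", "System.Linq.Enumerable"], INTRODUCED_IN
)
print(".".join(str(n) for n in needed))
```

Once member analysis exists, the same idea would extend to members by keying the index on member signatures instead of type names.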

Insight welcome.

PS: To those wondering why I don't use Reflection, attempt using Type.GetType(string) on any of these:

Microsoft.Win32.Registry, mscorlib, Version=1.0.3300.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
Microsoft.Win32.Registry, mscorlib, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
Microsoft.Win32.Registry, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089 

It won't fail, but you also won't get what you would expect; even trying to load the 2.0 mscorlib by file name yields the 4.0 assembly.

Edit: I should add this analysis purely focuses on the publicly exposed elements. The rationale for this is you shouldn't ever focus on the dependencies of the internal structure of the BCL since it can change without warning. If you're writing a code generator and the expressions you synthesize rely on these types, your system is likely to be fragile as glass if something changes in the framework, like a patch that updates the 2.0 version's internal structure. Due to how .NET binds its members in Common Intermediate Language, this is a very real likelihood.

  • Was there an actual question in here? Mono.Cecil is a nice library for analyzing assemblies that may not load in the current runtime (works on .NET and not just Mono), though IKVM.Reflection is used more often for compilers because code can be shared with System.Reflection.Emit. Removing public members is a breaking change, and the idea of a "minimum" framework version implicitly assumes there will be no breaking changes. So it seems like all you really need to know is when each member was introduced, and take the maximum to find your common framework version. – Esme Povirk Sep 28 '13 at 03:15
  • I'm writing my own compiler framework. To understand the systems involved, I'm rolling my own ECMA-335 metadata parser similar to Mono.Cecil, and I wrote a generator to handle [metadata table parsing](http://www.abstraction-project.com/text/CliMetadataTypeDefinitionTableReader.html). Loading all four versions of the framework is fairly straightforward. The question is for people who have written library analysis tools to give me insight into what kind of functionality can be extrapolated from versioning analysis, _beyond_ what's been stated already. – Allen Clark Copeland Jr Sep 28 '13 at 05:02

1 Answer


From what I can tell, this analysis can be used to discern whether the target framework has the functionality a given assembly relies on. If, for example, you specify version 2 of the framework but rely on version 4 functionality, a detailed list of what's not valid could be created. LINQ is a prime example of this, since it was introduced in version 3.5 of the framework. I'll likely be adding processor architecture/platform to the assembly identities so I can compare different platform versions of the .NET BCL to the primary BCL. The individual changes are listed online, but getting a comprehensive list seems difficult.
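A rough Python sketch of that "what's not valid" list, assuming the versioning analysis has already produced a map from type name to the version that introduced it. The function name `unavailable_in` and the sample index are hypothetical and hand-written for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Version = Tuple[int, int]

def unavailable_in(
    target: Version, used_types: Iterable[str], introduced_in: Dict[str, Version]
) -> List[Tuple[str, Version]]:
    """List every used type that was introduced after the target version."""
    return sorted(
        (name, introduced_in[name])
        for name in set(used_types)
        if introduced_in[name] > target
    )

# Hand-written sample index, not output of the real analysis.
SAMPLE_INDEX: Dict[str, Version] = {
    "System.String": (1, 0),
    "System.Linq.Enumerable": (3, 5),
    "System.Threading.Tasks.Task": (4, 0),
}

report = unavailable_in(
    (2, 0),
    ["System.String", "System.Linq.Enumerable", "System.Threading.Tasks.Task"],
    SAMPLE_INDEX,
)
for name, version in report:
    print(f"{name} requires {version[0]}.{version[1]}")
```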

The same approach can compare the active intermediate assembly against the previous build to detect potential breaking changes (such as adding methods to an interface).
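As a minimal sketch of that build-to-build check in Python: the `TypeInfo` shape, the rule set, and the sample types are assumptions for illustration. It flags only three cases (a removed type, a removed member, and a member added to an interface), whereas a real tool would need full signature comparison.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

@dataclass(frozen=True)
class TypeInfo:
    """Hypothetical summary of one public type in a build."""
    kind: str                 # "class" or "interface"
    members: FrozenSet[str]   # public member signatures as plain strings

def breaking_changes(
    previous: Dict[str, TypeInfo], current: Dict[str, TypeInfo]
) -> List[str]:
    """Report changes between two builds that can break existing consumers."""
    problems: List[str] = []
    for name in sorted(previous):
        old = previous[name]
        new = current.get(name)
        if new is None:
            problems.append(f"removed type {name}")
            continue
        for member in sorted(old.members - new.members):
            problems.append(f"removed member {name}.{member}")
        # New interface members break every existing implementer.
        if old.kind == "interface":
            for member in sorted(new.members - old.members):
                problems.append(f"added interface member {name}.{member}")
    return problems

# Invented sample builds.
PREVIOUS = {
    "IShape": TypeInfo("interface", frozenset({"Area()"})),
    "Circle": TypeInfo("class", frozenset({"Area()", "Radius"})),
}
CURRENT = {
    "IShape": TypeInfo("interface", frozenset({"Area()", "Perimeter()"})),
    "Circle": TypeInfo("class", frozenset({"Area()", "Radius", "Perimeter()"})),
}

for problem in breaking_changes(PREVIOUS, CURRENT):
    print(problem)
```

Adding `Perimeter()` to the class is reported as harmless here, while the same addition to the interface is flagged.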

It could also be used to help automate change logs and to generate an iterable view of a library for analysis tools to build on.

I mention LINQ as a framework element because, on the compiler level, it could be supported as early as v2.0 of the framework, so long as the method patterns are present on the types involved. Even things like async are available that early, so long as the compiler has a means to express the functionality: if the user provides a Task<T> that meets the pattern requirements, you should be okay.