Dotnet developers more in demand than Java?

Are dotnet developers more in demand than Java developers? More often than not, Java has the bigger following, being Open Source and all. Given M$'s failures to deliver anything useful (practical)… Silverlight, remember? Don't mention smartphones, or the disappearance of the Start menu from the taskbar – all of which hurts the career prospects of dotnet developers. However, it appears that after all the M$ bashing (it's still *cool* to beat up M$? Apple is cool, M$ is Evil?), the tide is turning. While dotnet has fewer developers and a smaller following, it has a disproportionate number of jobs, particularly in the more lucrative *Investment* banking industry. It's important to work only for firms that pay, isn't it? Or you might as well work for firms which emphasize a "Results Oriented" but not so much a "Reward Oriented" culture.

http://news.efinancialcareers.com/hk-en/218206/the-hottest-programming-languages-in-singapore-and-hong-kong/?_ga=1.149779957.253477064.1438526807

(Yes, there will always be a high number of lowly paid, shabby web dev jobs with a *limited budget* – always Limited, very Limited – since employers always aim to offshore to the lowest-paid, "Yes-Sir" locations where devs say "Yes Sir" every morning to their boss.)

There's no time for complacency for M$ or dotnet developers, however. There are many competing technologies out there. Anything prefixed "apache" – "avro", for example – can threaten anything home-grown from M$, WCF for instance. Yes, from now on it's *Cool* to define schemas via Json, not XML. Yes? Actually that's a forward-looking statement – check out Avro's empty C# documentation section and its footprint in the dotnet development community. Look here: https://issues.apache.org/jira/browse/AVRO-1420
http://www.codeproject.com/Articles/1020144/Serializers-in-NET-v?msg=5112338#xx5112338xx
And if you try to compile a json avsc schema with avrogen.exe – well, it's mostly undocumented space, for a start. And if you "import" anything (which is quite an essential requirement) you will quickly run into "Undefined Name" – the avrogen.exe utility (which converts schemas in Json/avsc files into C# classes) for dotnet is broken (I'm sure the versions for Java/Python work):
https://issues.apache.org/jira/browse/AVRO-1724
Avro was invented by Java and Open Source developers. dotnet always comes last in their considerations.

M$ invested a lot in "Patterns". Now, it depends on where you're coming from – whether you work for a component vendor or, say, a bank. If you're building "Enterprise" applications, what you develop most often has at most a ten-year, more often a five-year, lifetime; then a new, ambitious CIO comes along, scraps everything and starts afresh. Further, your "libraries" will have a very limited audience (compared with, say, a component vendor like Infragistics/DevExpress). Patterns accomplish very little except getting a small number of developers to code with the same GoF (Go5.x anyone?) cookie cutters (Important? Yes… yeah). Over-engineering your applications accomplishes nothing.

Like WPF to define GUIs, or WCF for interprocess communication. Worse is Prism – "What does it do for me, for the learning curve it takes?" (For myself, and for every developer on the team.)

Good software isn't always (50% of cases?) about writing software with the same cookie cutters (repo, prism, spring, hibernate, dao) as everyone else. Wizards do things that other people can't – not write code complying with GoF, or Prism, or whatever other over-interpreted concepts school teaches you. Or whatever the latest fad is – IoC, blah blah (the mere thought of those academic subjects puts me to sleep).

M$ needs to re-focus, and stop screwing its developers. Stop doing things like inventing Silverlight or WCF and then ditching them two years later (a joke?). Minimize the budget for the Patterns and Practices team. Perhaps just embrace Android and build on top of it (if you can't Conquer it, Use it, Exploit it).
M$ needs to re-focus on technology that delivers *Capability* (or Hardware/Software that simply *Looks Good*) – not on imposing more rules on #dotnet devs and screwing her affiliates/vendors. Just how badly have vendors like DevExpress or Infragistics been screwed, having developed product offerings for Silverlight, which M$ herself then abandoned? And how much has *your own* career been hurt as a result of M$'s stupidity in recent years?

M$, Create Fan Boys, not Enemies.


Architecture for Financial Applications – Rethinking Object Oriented Paradigm

Multi-Tier Application Architecture isn't a new concept to anyone who has done any sort of enterprise development – to the point that nobody even asks about it during technical interviews anymore.
At minimum, there are always three basic tiers, whether you're building a web application or a client-server application:
a. Presentation
b. Application – Business logic
c. Data source
For financial applications, where do you put your calculations? That's a matter of debate (but it shouldn't be).
I can't tell you how many times I have seen applications get built using the standard cookie cutter: DAOs load records from the database into entities in the Application tier, fashionably complying with every standard practice – an OR mapper such as Hibernate, Repositories and DAOs with Spring, and an IoC container for every object. I'm not sure if people do this because they feel the need to comply with the Golden OO design paradigm, or because they're too afraid to deviate from "Best Practice". This pattern simply doesn't apply to every scenario – and not just "edge cases".
For starters,
a. What kind of calculation are you running? Derivatives risk and pricing? VaR? Stressing? Time series analysis, covariance calculation, factor construction in portfolio optimization? These are computationally intensive. Quant libraries are generally in C++, Python or Java, and the load is typically distributed – the calculations are thus done in the "Application Tier".
Or are you running simple pnl updates, aggregating position-level pnl/return/risk to book level? Funding or trade allocation? Reconciliation? These are simple mathematics (no quant library): key concatenation/matching and simple aggregation. Which brings us to the next point.
b. Data volume, performance, and proximity to the data source. If your calculation sources or operates on a lot of data – unless the calculation is complex by nature, or requires quant libraries – there's probably very little reason why it should be done in the Application Tier. Databases are extremely good at key concatenation/matching, aggregation and simple arithmetic. If the data is already in the database and you're processing more than a few thousand rows, performance gains can be realised by running these calculations in the database/SQL BEFORE you move data from the database tier to the application tier. Even with multiple data sources (even message buses or non-SQL sources), one can always build simple data feeds and consolidate into a single database. The downside of this approach: SQL is not portable across database vendors.
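For example, a book-level pnl rollup of this kind can stay entirely in the database tier – only the aggregated rows ever cross the wire. (Table and column names here are hypothetical, purely for illustration.)

```sql
-- Hypothetical schema: Position(BookId, SecurityId, Qty, Price, PrevPrice)
-- Aggregate position-level pnl up to book level BEFORE any data leaves the database.
SELECT
    BookId,
    SUM(Qty * (Price - PrevPrice)) AS BookPnl
FROM Position
GROUP BY BookId
ORDER BY BookId;
```

Simple arithmetic, key matching and aggregation – exactly what the database engine is optimized for.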
c. Support
If calculations are done in SQL, production troubleshooting can be done without a debugger, which in turn means Level One support doesn't need to bother the developers. More importantly, fixes can be simple SQL patches – no need to recompile and redeploy, each of which adds risk.
d. Simplicity, Agile, and Maintainability
Let's keep things simple. You're adding complexity every time you add a bean, entity or DAO – especially when you have to maintain these mundane pieces of code manually. Imagine the amount of work if new fields need to be added – worse if the additions have to be made across multiple entities/DAOs (although we no longer need to update hbm files any more, thanks very much). Speaking of which, #dotnet is full of Agile magic. Linq-to-SQL [completely] automates the generation of entities & DAOs: https://www.youtube.com/watch?v=bsncc8dYIgY
You don't even need to hand-code the entities (unlike Java/Hibernate). Linq-to-SQL, however, does not support schema updates – but one can simply delete the old entity/DAO files, then drag-drop to re-create the entities/DAOs in seconds. This said, the Achilles heel of Linq-to-SQL is: it supports Microsoft SQL Server only! (Ouch!!!)
#msdev are blessed with Linq-to-SQL, but it doesn't follow that Linq-to-SQL is the shortest path for every scenario. Most financial applications deal with data in tabular format – #yesSQL! And most financial calculations are simple additions/subtractions and multiplications which can be done in the SQL layer. Another piece of Agile magic that #dotnet has (and Java has not) is DataTable – there's nothing simpler than loading a DataTable with a DataAdapter and binding it to a grid on the front-end. Example: http://www.dotnetperls.com/sqldataadapter
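A minimal sketch of the DataTable approach – the connection string and query below are placeholders, adjust to your own environment. No entities, no DAOs: just fill and bind.

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical connection string and query - purely for illustration.
string connString = "Server=.;Database=RiskDb;Integrated Security=true";
DataTable bookPnl = new DataTable();
using (SqlDataAdapter adapter = new SqlDataAdapter(
    "SELECT BookId, SUM(Qty * Price) AS Exposure FROM Position GROUP BY BookId",
    connString))
{
    adapter.Fill(bookPnl);   // schema and rows are inferred from the result set
}
// Winforms: bind straight to a grid - no entity classes required.
// dataGridView1.DataSource = bookPnl;
```

Add a column to the query and the grid picks it up on the next run – no bean, no mapping file, no DAO to touch.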
Many times, there's just no real need to create entities merely to comply with the OO paradigm. Just because you don't "bean" your reports and put them in nice little coffins doesn't mean it's dirty coding. In fact, with fewer source files, your code is cleaner.
Indeed, the emphasis of Agile development should not be Kanban boards, stand-up meetings and the micro-management of developers by incompetent or non-technical Project Managers (a title more appropriately renamed Project Administrative Assistant). Instead, the emphasis and focus should be the exploration and utilization of available Open Source and commercial tooling that actually does the work for you, and doing things as simply as possible by leveraging such technologies.
Don’t Over-Engineer and Happy Coding!

Oh… apparently the last time someone thought about this was back in 2010: http://blog.jot.fm/2010/08/26/ten-things-i-hate-about-object-oriented-programming/

Reverse Engineering Data-flow in a Data Platform with Thousands of tables?

Ever been tasked to inherit, or migrate, an existing (legacy) Data Platform? There are numerous Open Source (Hadoop/Sqoop, Schedulix…) and commercial tools (BMC Control-M, Appliedalgo.com, Stonebranch…etc) which can help you operate a Data Platform – typically giving you a multitude of platform services:
• Job Scheduling
• ETL
• Load Balancing & Grid Computing
• Data Dictionary / Catalogue
• Execution tracking (track/persist job parameters & output)

A typical large-scale application has hundreds to thousands of input data files, queries and intermediate/output data tables.

The aforementioned Open Source and commercial packages facilitate the operation of a Data Platform. Tools which help generate ERD diagrams typically rely on PK-FK relationships being defined – but of course, more often than not, that is not the case. Example? Here's how you can drag-drop tables from a Microsoft SQL Server onto a canvas to create an ERD – https://www.youtube.com/watch?v=BNx1TYItQn4

If you're tasked to inherit or migrate such a Data Platform, the first order of business is to map out the data flow. Why? To put in a fix or an enhancement, you'd first need to understand the data flow before any work can commence.

And that's a very expensive, time-consuming proposition.

There are different ways to tackle the problem. Here's one (not-so-smart) option:
• Manually review database queries and stored procedures
• Manually review application source code and extract from it embedded SQL statements

Adding to the complexity:
• Dynamic SQL
• Object Relational Mapper (ORM)

The more practical approach is to employ a SQL Profiler: capture the SQL statements executed, then trace the flow manually. Even then, this typically requires experienced developers to get the job done (which doesn't help when you want to keep costs down and the delivery lead time as short as possible). Such an undertaking is inherently risky – you can't really estimate how long it'll take to map out the flow until you actually do it.

There's a command-line utility, MsSqlDataflowMapper (free), from appliedalgo.com which can help. Basically, MsSqlDataflowMapper takes a SQL Profiler trace file as input (xml), analyzes the captured SQL statements, looks for INSERTs and UPDATEs, then automatically dumps the data flow to a flow chart (HTML5). Behind the scenes, it uses SimpleFlowDiagramLib from Gridwizard to plot the flow chart – https://gridwizard.wordpress.com/2015/03/31/simpleflowdiagramlib-simple-c-library-to-serialize-graph-to-xml-and-vice-versa/

Limitations?
• Microsoft SQL Server only (to get around this, you can build your own tool to capture SQL statements against Oracle/Sybase/MySQL…etc, analyze them, look for INSERTs and UPDATEs, then route the result to SimpleFlowDiagramLib to plot the flow chart)
• MsSqlDataflowMapper operates at table level. It identifies source/destination tables in the process of mapping out the flow. However, it doesn't provide field-level lineage (which source tables does a particular field in an output table come from?)
• The tool does NOT automatically *group* related tables into regions of the diagram (this would require a lot more intelligence in the construction of the tool – as we all know, parsing SQL is actually a very complex task! https://gridwizard.wordpress.com/2014/11/08/looking-for-a-sql-parser-for-c-dotnet). At the end of the day, it still takes a skilled developer to make sense of the flow.
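To get a feel for what table-level mapping involves, here's a deliberately naive sketch (not MsSqlDataflowMapper's actual implementation) that pulls destination tables out of captured statements with a regular expression. It will mis-parse plenty of real-world SQL (CTEs, MERGE, dynamic SQL, quoted identifiers) – which is exactly the point about needing a proper parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class NaiveTargetTableExtractor
{
    // Matches "INSERT [INTO] <table>" and "UPDATE <table>".
    // Naive on purpose: ignores CTEs, MERGE, dynamic SQL, identifiers with spaces, etc.
    static readonly Regex TargetTable = new Regex(
        @"\b(?:INSERT\s+(?:INTO\s+)?|UPDATE\s+)(?<table>[\w\.\[\]]+)",
        RegexOptions.IgnoreCase);

    public static ISet<string> ExtractTargets(IEnumerable<string> statements)
    {
        var targets = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string sql in statements)
            foreach (Match m in TargetTable.Matches(sql))
                targets.Add(m.Groups["table"].Value);
        return targets;
    }
}
```

Feed it the statements captured by the profiler and you get candidate destination tables; the source tables (and everything the regex gets wrong) still need a skilled pair of eyes.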

Happy Coding!

SimpleFlowDiagramLib: Simple C# library to Serialize Graph to Xml (And Vice Versa)

In continuation of the previous article https://gridwizard.wordpress.com/2015/03/25/simple-c-library-to-render-graph-to-flowchart/, we'll explore "SimpleFlowDiagramLib"'s capability to serialize a graph to Xml (and vice versa). Why would we want to do that? To wire a graph down to/from a Web Service consumed by, for example, a Java client.

Again,
Source code: https://github.com/gridwizard/SimpleFlowDiagram

using System;
using System.IO;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

using SimpleFlowDiagramLib;

namespace DemoSimpleFlowDiagramLib
{
    class Program
    {
        
        static void Main(string[] args)
        {
            // STEP 1. Generate the nodes, render them to XML format (GraphXml)
            IList<Node> Nodes = new List<Node>();
            GenerateNodes(Nodes, 3, 3);
            SimpleFlowDiagramGeneratorCompatibleGraphRender XmlConverter = new SimpleFlowDiagramGeneratorCompatibleGraphRender();
            string GraphXml = XmlConverter.RenderGraph(Nodes);
            string GraphXmlFilePath = "GraphXml.xml";
            System.IO.File.WriteAllText(GraphXmlFilePath, GraphXml);
            Console.WriteLine("Finished writing graph to XML format compatible with SimpleFlowDiagramGenerator.exe");

            // STEP 2. Read back from GraphXml, then recalculate the layout
            MemoryStream Stream = new MemoryStream();
            StreamWriter writer = new StreamWriter(Stream);
            writer.Write(GraphXml);
            writer.Flush();
            Stream.Position = 0;
            System.Xml.XmlReader XmlRdr = System.Xml.XmlReader.Create(Stream);
            IList<Node> ResurrectedNodes = XmlConverter.ReadGraphXml(XmlRdr);
            CanvasDefinition Canvas = DiagramCanvasEngine.GenerateLayout(
                ResurrectedNodes,
                Node.DEFAULT_NODE_HEIGHT / 2,
                CanvasDefinition.LayoutDirection.LeftToRight
                );

            // STEP 3. Render the nodes to an HTML file - this is exactly what "SimpleFlowDiagramGenerator.exe" does:
            // it reads an input xml file which defines the nodes, then renders the flowchart to an HTML file.
            GraphDisplayFormatSettings DisplaySettings = new GraphDisplayFormatSettings();
            IGraphRender Html5Render = new Html5GraphRender();
            Html5Render.RenderGraph(Canvas, ResurrectedNodes, DisplaySettings, "Flowchart.html");
            Console.WriteLine("Finished render to HTML5 to Flowchart.html");
            return;
        }
        public static void GenerateNodes(IList<Node> Nodes, int NumRootNodes, int MaxTreeDepth)
        {
            Node RootNode;
            for (int i = 0; i < NumRootNodes; i++)
            {
                RootNode = new Node();
                RootNode.NodeHeader = "Root_" + i;
                RootNode.NodeDetail = "Some detail ...";
                RootNode.NodeHyperLink = "http://somewhere.com";
                RootNode.Depth = 0;
                Nodes.Add(RootNode);
                GenerateSingleGraph(Nodes, RootNode, MaxTreeDepth);
            }
            return;
        }

        public static void GenerateSingleGraph(IList<Node> Nodes, Node RootNode, int MaxTreeDepth)
        {
            int CurrentDepth = 0;
            RecursiveGenerateGraph(Nodes, RootNode, MaxTreeDepth, ref CurrentDepth);
            return;
        }

        public static void RecursiveGenerateGraph(IList<Node> Nodes, Node Node, int MaxTreeDepth, ref int CurrentDepth)
        {
            CurrentDepth++;
            Random rnd = new Random(DateTime.Now.Second);
            if (CurrentDepth < MaxTreeDepth)
            {
                int NumChildren = rnd.Next(5);
                for (int i = 0; i < NumChildren; i++)
                {
                    Node Child = new Node();
                    Child.NodeHeader = Node.NodeHeader + "." + "Child_Level" + CurrentDepth + "_Num" + i;
                    Child.NodeDetail = "Some detail ...";
                    Child.NodeHyperLink = "http://somewhere.com";
                    Child.Depth = CurrentDepth;
                    Child.ParentNodes.Add(Node);
                    Node.ChildNodes.Add(Child);
                    Nodes.Add(Child);
                    int CopyCurrentDepth = CurrentDepth;
                    RecursiveGenerateGraph(Nodes, Child, MaxTreeDepth, ref CopyCurrentDepth);
                }
            }
            return;
        }
    }
}

Happy Coding!

You may also want to check out how to convert DataTable to/from HTML Table – https://gridwizard.wordpress.com/2014/12/17/datatable-to-from-html-table

Simple C# Library to render graph to Flowchart

A simple C# library to render a graph to a flowchart – currently it renders only to HTML5 (with the intention of supporting Visio in future).

You can render your graph horizontally (left to right) or vertically (top down). The layout engine is device-independent, and agnostic of whether you render to HTML5, Winforms, WPF… The library automatically centers parent nodes and calculates Node.x/y and the overall canvas size (in case you want to render to surfaces other than HTML5 – for example Visio, WPF, Winforms…etc).

Top-to-Bottom
SimpleFlowDiagramLib.Demo.TopToBottom

Note that we didn't scale the text to fit the boxes – automatic scaling would make the text so small you couldn't read it, making things even worse. Also, notice that parent nodes are horizontally center-aligned.

Left-to-Right
SimpleFlowDiagramLib.Demo.LeftToRight

Parent nodes are vertically center-aligned.

Source code: https://github.com/gridwizard/SimpleFlowDiagram

Usage:
It couldn't be simpler to use – the bulk of the code at the bottom just creates dummy data for illustration purposes.
a. Node.x/y are calculated by the call to DiagramCanvasEngine.GenerateLayout (you can use "Nodes" to render on other, non-HTML5 surfaces)
b. Html5Render.RenderGraph renders to HTML5

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

using SimpleFlowDiagramLib;

namespace DemoSimpleFlowDiagramLib
{
    class Program
    {
        
        static void Main(string[] args)
        {
            IList<Node> Nodes = new List<Node>();
            GenerateNodes(Nodes, 3, 3);
            Console.WriteLine("Finished generating dummy nodes, # Nodes: " + Nodes.Count);

            // Node.x/Node.y + Canvas size calculated (you can adjust programmatically as you see fit afterwards)
            CanvasDefinition Canvas = DiagramCanvasEngine.GenerateLayout(
                Nodes,
                Node.DEFAULT_NODE_HEIGHT / 2,
                CanvasDefinition.LayoutDirection.LeftToRight
                );
            Console.WriteLine("Finished calculating layout");

            GraphDisplayFormatSettings DisplaySettings = new GraphDisplayFormatSettings();
            // You can override display font, fore/back color ...etc
            DisplaySettings.NodeHeaderSettings.ForeColorName = "Black";
            DisplaySettings.NodeDetailSettings.ForeColorName = "Black";

            IGraphRender Html5Render = new Html5GraphRender();
            Html5Render.RenderGraph(Canvas, Nodes, DisplaySettings, "Flowchart.html");
            Console.WriteLine("Finished render to HTML5");
            return;
        }
        public static void GenerateNodes(IList<Node> Nodes, int NumRootNodes, int MaxTreeDepth)
        {
            Node RootNode;
            for (int i = 0; i < NumRootNodes; i++)
            {
                RootNode = new Node();
                RootNode.NodeHeader = "Root_" + i;
                RootNode.NodeDetail = "Some detail ...";
                Nodes.Add(RootNode);
                GenerateSingleGraph(Nodes, RootNode, MaxTreeDepth);
            }
            return;
        }

        public static void GenerateSingleGraph(IList<Node> Nodes, Node RootNode, int MaxTreeDepth)
        {
            int CurrentDepth = 0;
            RecursiveGenerateGraph(Nodes, RootNode, MaxTreeDepth, ref CurrentDepth);
            return;
        }

        public static void RecursiveGenerateGraph(IList<Node> Nodes, Node Node, int MaxTreeDepth, ref int CurrentDepth)
        {
            CurrentDepth++;
            Random rnd = new Random(DateTime.Now.Second);
            if (CurrentDepth < MaxTreeDepth)
            {
                int NumChildren = rnd.Next(5);
                for (int i = 0; i < NumChildren; i++)
                {
                    Node Child = new Node();
                    Child.NodeHeader = Node.NodeHeader + "." + "Child_Level" + CurrentDepth + "_Num" + i;
                    Child.NodeDetail = "Some detail ...";
                    Child.ParentNodes.Add(Node);
                    Node.ChildNodes.Add(Child);
                    Nodes.Add(Child);
                    int CopyCurrentDepth = CurrentDepth;
                    RecursiveGenerateGraph(Nodes, Child, MaxTreeDepth, ref CopyCurrentDepth);
                }
            }
            return;
        }
    }
}

Happy Coding!

Next, SimpleFlowDiagramLib – LIBRARY TO SERIALIZE GRAPH TO XML (AND VICE VERSA) – https://gridwizard.wordpress.com/2015/03/31/simpleflowdiagramlib-simple-c-library-to-serialize-graph-to-xml-and-vice-versa

Parsing Microsoft SQL Profiler Trace XML using *DynamicXmlStream*

This article shows how compact the syntax for extracting SQL statements from a SQL Profiler trace (Microsoft SQL Server) can be, using Mahesh's DynamicXmlStream – which is *dynamic* (http://www.codeproject.com/Articles/436406/Power-of-Dynamic-Reading-XML-and-CSV-files-made-ea)

And the code to parse it (it doesn't get more compact than this!):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

using LearningMahesh.DynamicIOStream;
using LearningMahesh.DynamicIOStream.Xml;

static void Main(string[] args)
        {
            string TraceFileName = null;
            string SQLStatement = null;
            IList<string> SQLStatements = null;

                if (args != null && args.Length > 0)
                {
                    TraceFileName = args[0];
                }

                #region STEP 1. Extract SQL from profiler trace
                dynamic profilerReader = DynamicXmlStream.Load(new FileStream(TraceFileName, FileMode.Open));

                SQLStatements = new List<string>();

                foreach (
                     dynamic Event in 
                        (profilerReader.TraceData.Events.Event 
                            as DynamicXmlStream).AsDynamicEnumerable()
                            .Where(Event => Event.name.Value =="SQL:BatchStarting")
                    )
                {
                    foreach (dynamic Column in 
                        (Event.Column as DynamicXmlStream).AsDynamicEnumerable()
                        .Where(Column => Column.name.Value == "TextData")
                        )
                    {
                        SQLStatement = Column.Value;
                        SQLStatements.Add(SQLStatement);
                        Console.WriteLine(SQLStatement);
                    }
                }
                
            return;
        }

Java and dotnet Interop

This article is about Java-dotnet interop. We'll explore the options available for the different scenarios where interop is required.

First, when we say “Java-dotnet Interop”, there are two possibilities:

1. Java -to- dotnet communications

2. dotnet -to-Java communications

Secondly, we assume that if you're developing in Java, you'll run it on Linux (simply put: if your application is written in Java, why would it run on Windows?).

Given the above, what are our options?

 

1. Socket

Anand Manikiam has written a piece on this subject, http://www.codeproject.com/Articles/11602/Java-and-Net-interop-using-Sockets

The pros for this approach are:

a. No middle-ware

b. Fast

The cons are:

a. Resiliency

b. Casting complex objects/classes from byte[]?

c. Message security? Encryption? Anti-tampering? DoS? If these aren't implemented, this should be an Intranet-only application.
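On the dotnet side, the socket option is just the plain TcpListener/TcpClient API. Here's a minimal sketch (the port number and the string framing are arbitrary choices for illustration; in practice you'd have to agree on a wire format with the Java side, which is exactly the casting-complex-objects problem):

```csharp
using System.Net;
using System.Net.Sockets;
using System.Text;

// Minimal dotnet-side socket server; a Java client connects with java.net.Socket.
TcpListener listener = new TcpListener(IPAddress.Any, 9000);   // port 9000 is arbitrary
listener.Start();
using (TcpClient client = listener.AcceptTcpClient())          // blocks until the Java client connects
using (NetworkStream stream = client.GetStream())
{
    byte[] buffer = new byte[4096];
    int read = stream.Read(buffer, 0, buffer.Length);
    // You only ever get raw bytes - reconstructing complex objects from byte[]
    // means hand-rolling or agreeing on a serialization format with the Java side.
    string message = Encoding.UTF8.GetString(buffer, 0, read);
    byte[] reply = Encoding.UTF8.GetBytes("ACK:" + message);
    stream.Write(reply, 0, reply.Length);
}
listener.Stop();
```

No middleware, and fast – but everything above the byte level (resiliency, security, message framing) is yours to build.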

 

2. Web Services

I’ve written an article of consuming Java-ws from dotnet:

https://gridwizard.wordpress.com/2014/12/26/java-ws-and-dotnet-interop-example/

You will also find plenty of discussions on consuming WCF-from-Java:

http://www.codeproject.com/Articles/777036/Consuming-WCF-Service-in-Java-Client

The pros for this approach are:

a. No middle-ware

b. Higher level of compatibility with code written in more languages (C++/SOAP, Python, R…etc)

The cons are:

a. Slower than sockets (Web Services overhead)

b. Resiliency

c. Message security? Encryption? Anti-tampering? DoS? If these aren't implemented, this should be an Intranet-only application.


 

3. Message Bus

RabbitMQ (http://www.rabbitmq.com) is all about messaging. If you're developing real-time applications, RabbitMQ offers a high-performance, battle-tested communication platform, and it has an API for just about any language on the planet: C++, dotnet, Java, Perl, Python…
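A minimal publish sketch with the RabbitMQ .NET client is below – the queue name and host are placeholders; a Java consumer would subscribe to the same queue via RabbitMQ's Java client, which is the whole interop point.

```csharp
using System.Text;
using RabbitMQ.Client;

ConnectionFactory factory = new ConnectionFactory { HostName = "localhost" };
using (IConnection connection = factory.CreateConnection())
using (IModel channel = connection.CreateModel())
{
    // Both the dotnet producer and the Java consumer declare the same queue;
    // declaring is idempotent, so whichever side starts first creates it.
    channel.QueueDeclare(queue: "interop.demo", durable: false,
                         exclusive: false, autoDelete: false, arguments: null);
    byte[] body = Encoding.UTF8.GetBytes("hello from dotnet");
    channel.BasicPublish(exchange: "", routingKey: "interop.demo",
                         basicProperties: null, body: body);
}
```

The broker buffers the message if the Java side is down – that's the resiliency pro above, bought at the cost of installing middleware.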

Pros are:

a. Resiliency – producers and consumers can die and crash at any moment.

b. Performance

cons:

a. You need to install middleware – and if you're a software vendor, you'd need to bundle the RabbitMQ installation with your application

 

4. Commercial Tools

Depending on what you're building: if it's a computing grid, there are commercial tools which allow you to run jobs on basically any platform, coded in any language.

Appliedalgo.com, for instance, supports:

a. Scheduling, conditional job chaining and Workload Automation

b. Grid Computing – nodes/slaves on any platform/language

c. Automatic persistence of run history, parameters, input and results

(You can even configure cell-level validations via "IsNumber" or user-specified Regular Expressions.)

d. GUI for you to track run parameters, input and results

However, such tools inevitably introduce execution overhead. So it depends on whether you're…

a. Executing a high number of lightweight jobs –> probably best not to use any tool besides a message bus such as RabbitMQ

b. Executing a medium number of medium-weight jobs –> the best fit for Workload Automation Data Platforms such as Appliedalgo.com

c. Executing a low number of heavyweight jobs –> best custom-coded, with persistence via BCP (there's no other way for million-row or #bigdata processing)

 
But this would not be a viable option if, for instance, you're building a hotel booking system with a web tier built in ASP.NET and a backend in Java with Java-ws.

Happy Coding!