ETL

Java and dotnet Interop

This article is about Java-dotnet Interop. We’ll explore what options we have for different scenario where interop is required.

First, when we say “Java-dotnet Interop”, there are two possibilities:

1. Java -to- dotnet communications

2. dotnet -to-Java communications

Secondly, we assume, if you’re developing in Java, you’d run it on Linux (Or simply put, if your application written in Java, why would it run on Windows?)

Given above, what are our options?

 

1. Socket

Anand Manikiam has written a piece on this subject, http://www.codeproject.com/Articles/11602/Java-and-Net-interop-using-Sockets

The pros for this approach are:

a. No middle-ware

b. Fast

The cons are:

a. Resiliency

b. Casting complex object/classes from byte[]?

c. Message security? Encryption? Anti-tampering? DOS? If not implemented this be Intranet application only.

 

2. Web Services

I’ve written an article of consuming Java-ws from dotnet:

https://gridwizard.wordpress.com/2014/12/26/java-ws-and-dotnet-interop-example/

You will also find plenty of discussions on consuming WCF-from-Java:

http://www.codeproject.com/Articles/777036/Consuming-WCF-Service-in-Java-Client

The pros for this approach are:

a. No middle-ware

b. Higher level of compatibility with code coded in more languages (C++/SOAP, Python, R …etc)

The cons are:

a. Less fast than socket

b. Resiliency

c. Message security? Encryption? Anti-tampering? DOS? If not implemented this be Intranet application only.

d. Slower than Socket! (Web Services overhead)

 

3. Message Bus

RabbitMQ (http://www.rabbitmq.com) is all about Messaging. If you’re developing real time applications, RabbitMQ offers high performance battle tested communication platforms and it as an API for just about any language on the planet. C++, dotnet, Java, Perl, Python…

Pros are:

a. Resiliency – producers and consumers can die and crash at any moment.

b. Performance

cons:

a. You need install Middleware, and if you’re a software vendor, you’d need bundle installation of RabbitMQ with your application

 

4. Commercial Tools

Depending on what you’re building, if what you’re trying to build is a computing grid, then there are commercial tools which allows you to run jobs on basically any platform, coded in any language.

Appliedalgo.com for instances supports:

a. Scheduling, conditional job chaining and Workload Automation

b. Grid Computing – nodes/slaves on any platform/language

c. Automatic persistence of run history, parameters, input and results

(Even configure cell level validations by “IsNumber”, or use of user specified Regular Expression)

d. GUI for you to track run parameters, input and results

However, such tools inevitably introduces execution overhead. So depending on whether you’re …

a. Executing high number of light weight jobs –> Probably should not use any tool besides a Message bus such as RabbitMQ

b. Executing medium number of medium weight jobs –> Best application of Workload Automation Data Platforms such as Appliedalgo.com

c. Executing low number of heavy weight jobs –> Best custom coded, persistence via BCP (There’s no other way for million rows or #bigdata processing)

 
But this would not be a viable option for instance if you’re building a hotel booking system with web tier built in ASP.NET and backend in Java with Java-ws

Happy Coding!

 

 

Advertisements

TSQL – Utility to read csv and Excel files

/*
If you see this error?
The 32-bit OLE DB provider “Microsoft.ACE.OLEDB.12.0″ cannot be loaded in-process on a 64-bit SQL Server.

To resolve this error you need to get the “Access Database Engine 2010″ to facilitate the transfer of data between existing Microsoft Office files such as “Microsoft Office Access 2010″ (*.mdb and *.accdb) files and “Microsoft Office Excel 2010″ (*.xls, *.xlsx, and *.xlsb) files to other data sources such as “Microsoft SQL Server”.
Get the Microsoft Access Database Engine 2010 Redistributable [http://www.microsoft.com/en-us/download/details.aspx?id=13255]

http://sqlwithmanoj.com/tag/microsoft-ace-oledb-12-0/
*/
IF EXISTS (SELECT * FROM sys.objects WHERE type = ‘P’ AND name = ‘sp_ReadCSVFile’)
DROP PROCEDURE sp_ReadCSVFile
GO

IF EXISTS (SELECT * FROM sys.objects WHERE type = ‘P’ AND name = ‘sp_ReadExcelFile’)
DROP PROCEDURE sp_ReadExcelFile
GO

IF EXISTS (
SELECT * FROM sysobjects WHERE id = object_id(N’fn_SplitString’)
AND xtype IN (N’FN’, N’IF’, N’TF’)
)
DROP FUNCTION fn_SplitString
GO

/*

Remember to put a file Schema.ini in same folder as csv file. Also ‘@TargetTable’ parameter should be csv file name (but without Extension)

exec sp_ReadCSVFile ‘C:\tmp’, ‘EmpData’

[EmpData.csv]
FORMAT=Delimited(;)
ColNameHeader=True
MaxScanRows=0
CharacterSet=ANSI
TextDelimiter=`
DecimalSymbol=.
COL1=EmpId TEXT
COL2=FirstName TEXT
COL3=LastName TEXT
COL4=MiddleName TEXT
COL5=EmailAddr TEXT
*/
CREATE PROCEDURE sp_ReadCSVFile
(
@TargetFolder varchar(255),
@TargetTable varchar(255)
)
AS
BEGIN
declare @SqlStmt nvarchar(max)

set @SqlStmt = ‘IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N”[dbo].[‘ + @TargetTable + ‘]”) AND type in (N”U”))’
set @SqlStmt = @SqlStmt + ‘ TRUNCATE TABLE ‘ + @TargetTable
PRINT @SqlStmt
exec sp_executesql @SqlStmt

set @SqlStmt = ‘ SELECT * INTO ‘ + @TargetTable
+ ‘ FROM OpenDataSource (”Microsoft.ACE.OLEDB.12.0”, ”Data Source=”‘ + @TargetFolder + ‘”;Extended properties=Text”)…[‘
+ @TargetTable + ‘#csv]’
PRINT @SqlStmt
exec sp_executesql @SqlStmt
END
GO

/*
Usage:
exec sp_ReadExcelFile ‘C:\tmp’, ‘EmpData’, ‘EmpData.xls’, ‘[Sheet1$]’, ‘Col1,Col2,Col3,Col4,Col5’
*/
CREATE PROCEDURE sp_ReadExcelFile
(
@TargetFolder varchar(255), /* For example, ‘C:\tmp’ */
@TargetTable varchar(255), /* For example, ‘EmpData’ */
@ExcelFile varchar(255), /* For example, ‘EmpData.xls’ */
@ExcelSheet varchar(255), /* For example, ‘[Sheet1$]’ */
@ExcelFields varchar(8000) /* Comma separate list, for example: ‘Col1,Col2,Col3,Col4,Col5’ */
)
AS
BEGIN
declare @SqlStmt nvarchar(max)
declare @FirstColumn nvarchar(255)

set @SqlStmt = ‘IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N”[dbo].[‘ + @TargetTable + ‘]”) AND type in (N”U”))’
set @SqlStmt = @SqlStmt + ‘ DROP TABLE ‘ + @TargetTable
PRINT @SqlStmt
exec sp_executesql @SqlStmt

select top 1 @FirstColumn=[part] from fn_SplitString(@ExcelFields, ‘,’)
print ‘FirstColumn: ‘ + @FirstColumn

set @SqlStmt = ‘SELECT ‘ + @ExcelFields + ‘ INTO ‘ + @TargetTable + ‘ FROM OPENROWSET(”Microsoft.ACE.OLEDB.12.0”,
”Excel 12.0;DATABASE=’ + @TargetFolder + ‘\’ + @ExcelFile + ‘;IMEX=1”, ”Select * from ‘ + @ExcelSheet + ”’)’
+ ‘WHERE NOT ‘ + @FirstColumn + ‘ IS NULL’
PRINT @SqlStmt
exec sp_executesql @SqlStmt
END
GO

/*
Taken from: http://stackoverflow.com/questions/697519/split-function-equivalent-in-t-sql
Usage: select top 1 part from fn_SplitString(‘aaa,bbb,ccc,ddd,eee’,’,’)
*/
CREATE FUNCTION [dbo].[fn_SplitString]
(
@sString nvarchar(2048),
@cDelimiter nchar(1)
)
RETURNS @tParts TABLE ( part nvarchar(2048) )
AS
BEGIN
if @sString is null return
declare @iStart int,
@iPos int
if substring( @sString, 1, 1 ) = @cDelimiter
begin
set @iStart = 2
insert into @tParts
values( null )
end
else
set @iStart = 1
while 1=1
begin
set @iPos = charindex( @cDelimiter, @sString, @iStart )
if @iPos = 0
set @iPos = len( @sString )+1
if @iPos – @iStart > 0
insert into @tParts
values ( substring( @sString, @iStart, @iPos-@iStart ))
else
insert into @tParts
values( null )
set @iStart = @iPos+1
if @iStart > len( @sString )
break
end
RETURN
END
GO

DataTable to-from HTML Table

Here’s a small handy utility which converts a C# DataTable to/from HTML Table. Happy Coding!

using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Text;
using System.Xml;

namespace ClientUtil
{
public class DataTableUtil
{

public static string DataTableToXmlString(DataTable dtData)
{
if (dtData == null || dtData.Columns.Count == 0)
return (string) null;
DataColumn[] primaryKey = dtData.PrimaryKey;
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.Append(“<TABLE>”);
stringBuilder.Append(“<TR>”);
foreach (DataColumn dataColumn in (InternalDataCollectionBase) dtData.Columns)
{
if (DataTableUtil.IsPrimaryKey(dataColumn.ColumnName, primaryKey))
stringBuilder.Append(“<TH IsPK=’true’ ColType='”).Append(Convert.ToString(dataColumn.DataType == typeof (object) ? (object) typeof (string) : (object) dataColumn.DataType)).Append(“‘>”).Append(dataColumn.ColumnName.Replace(“&”, “”)).Append(“</TH>”);
else
stringBuilder.Append(“<TH IsPK=’false’ ColType='”).Append(Convert.ToString(dataColumn.DataType == typeof (object) ? (object) typeof (string) : (object) dataColumn.DataType)).Append(“‘>”).Append(dataColumn.ColumnName.Replace(“&”, “”)).Append(“</TH>”);
}
stringBuilder.Append(“</TR>”);
int num1 = 0;
foreach (DataRow dataRow in (InternalDataCollectionBase) dtData.Rows)
{
stringBuilder.Append(“<TR>”);
int num2 = 0;
foreach (DataColumn dataColumn in (InternalDataCollectionBase) dtData.Columns)
{
string str = Convert.IsDBNull(dataRow[dataColumn.ColumnName]) ? (string) null : Convert.ToString(dataRow[dataColumn.ColumnName]).Replace(“<“, “&lt;”).Replace(“>”, “&gt;”).Replace(“\””, “&quot;”).Replace(“‘”, “&apos;”).Replace(“&”, “&amp;”);
if (!string.IsNullOrEmpty(str))
stringBuilder.Append(“<TD>”).Append(str).Append(“</TD>”);
else
stringBuilder.Append(“<TD>”).Append(“</TD>”);
++num2;
}
stringBuilder.Append(“</TR>”);
++num1;
}
stringBuilder.Append(“</TABLE>”);
return ((object) stringBuilder).ToString();
}

protected static bool IsPrimaryKey(string ColumnName, DataColumn[] PKs)
{
if (PKs == null || string.IsNullOrEmpty(ColumnName))
return false;
foreach (DataColumn dataColumn in PKs)
{
if (dataColumn.ColumnName.ToLower().Trim() == ColumnName.ToLower().Trim())
return true;
}
return false;
}

public static DataTable XmlStringToDataTable(string XmlData)
{
DataTable dataTable = (DataTable) null;
IList<DataColumn> list = (IList<DataColumn>) new List<DataColumn>();
if (string.IsNullOrEmpty(XmlData))
return (DataTable) null;
XmlDocument xmlDocument1 = new XmlDocument();
xmlDocument1.PreserveWhitespace = true;
XmlDocument xmlDocument2 = xmlDocument1;
xmlDocument2.LoadXml(XmlData);
XmlNode xmlNode1 = xmlDocument2.SelectSingleNode(“/TABLE”);
if (xmlNode1 != null)
{
dataTable = new DataTable();
int num = 0;
foreach (XmlNode xmlNode2 in xmlNode1.SelectNodes(“TR”))
{
if (num == 0)
{
foreach (XmlNode xmlNode3 in xmlNode2.SelectNodes(“TH”))
{
bool result = false;
string str = xmlNode3.Attributes[“IsPK”].Value;
if (!string.IsNullOrEmpty(str))
{
if (!bool.TryParse(str, out result))
result = false;
}
else
result = false;
Type type = Type.GetType(xmlNode3.Attributes[“ColType”].Value);
DataColumn column = new DataColumn(xmlNode3.InnerText, type);
if (result)
list.Add(column);
if (!dataTable.Columns.Contains(column.ColumnName))
dataTable.Columns.Add(column);
}
if (list.Count > 0)
{
DataColumn[] dataColumnArray = new DataColumn[list.Count];
for (int index = 0; index < list.Count; ++index)
dataColumnArray[index] = list[index];
dataTable.PrimaryKey = dataColumnArray;
}
}
else
{
DataRow row = dataTable.NewRow();
int index = 0;
foreach (XmlNode xmlNode3 in xmlNode2.SelectNodes(“TD”))
{
Type dataType = dataTable.Columns[index].DataType;
string s = xmlNode3.InnerText;
if (!string.IsNullOrEmpty(s))
{
try
{
s = s.Replace(“&lt;”, “<“);
s = s.Replace(“&gt;”, “>”);
s = s.Replace(“&quot;”, “\””);
s = s.Replace(“&apos;”, “‘”);
s = s.Replace(“&amp;”, “&”);
row[index] = Convert.ChangeType((object) s, dataType);
}
catch
{
if (dataType == typeof (DateTime))
row[index] = (object) DateTime.ParseExact(s, “yyyyMMdd”, (IFormatProvider) CultureInfo.InvariantCulture);
}
}
else
row[index] = Convert.DBNull;
++index;
}
dataTable.Rows.Add(row);
}
++num;
}
}
return dataTable;
}
}
}

That’s it. You may also want to check out how to convert DataTable to JSON string?
http://www.codeproject.com/Articles/777284/Consuming-JSON-ASMX-Web-Services-with-AngularJS-A

Taken’s from Mohamed Kamal Pharm’s article – first you convert DataTable.Rows into a List, the use System.Web.Script.Serialization.JavaScriptSerializer to render the list to a string:

public String ConvertDataTableTojSonString(DataTable dataTable)
{
System.Web.Script.Serialization.JavaScriptSerializer serializer =
new System.Web.Script.Serialization.JavaScriptSerializer();

List<Dictionary<String, Object>> tableRows = new List<Dictionary<String, Object>>();

Dictionary<String, Object> row;

foreach (DataRow dr in dataTable.Rows)
{
row = new Dictionary<String, Object>();
foreach (DataColumn col in dataTable.Columns)
{
row.Add(col.ColumnName, dr[col]);
}
tableRows.Add(row);
}
return serializer.Serialize(tableRows);
}

You may also want to checkout how to serialize a graph to/from xml! https://gridwizard.wordpress.com/2015/03/31/simpleflowdiagramlib-simple-c-library-to-serialize-graph-to-xml-and-vice-versa