Hadoop (http://hadoop.apache.org) has gained a lot of popularity in recent years and has laid claim to the throne in Grid Computing. There has been a lot of confusion about what is meant by Grid, Load Balancing, Big Data, Cloud, etc. There are grids geared towards persisting, parsing and analyzing non-relational data (social media, web scraping, etc.); Hadoop is one such example. There are software vendors that cater for simple Enterprise workflow (i.e. Scheduling, Job Chaining), such as BMC Control-M and Schedulix. There are also data platforms geared towards Numerical and Quantitative Analysis (data in relational format), such as Applied Algo ETL Suite. How do we decide what is suitable for what purpose?
What is Big Data?