Thoughts on Performance Optimization at Scale

by awahid on June 11, 2015 in blog

I came across Alex Levenson's presentation, and it made me feel that we might be living in a fool's paradise. The challenges of Big Data are not that simple to address. Every scenario is different, which makes it difficult for developers to provide a generic solution to Big Data challenges. The presentation is worth listening to, and you might… Read more →

How Big Data Can Transform Healthcare

by awahid on May 10, 2015 in blog

Big Data is mostly thought of as a way to evolve businesses or create smart cities; however, it can also be used in the healthcare sector. McKinsey & Company identified four main sources of Big Data in the healthcare industry: activity (claims) and cost data, clinical data, pharmaceutical R&D data, and patient behavior and sentiment data. I recently came across a very informative article which talks… Read more →

BigData and DataScience Skills worth more than $100000 Salary

by awahid on March 22, 2015 in blog

How many of you have seen the recently published 2015 Salary Survey by Dice.com? The survey consists of responses from 23,470 IT professionals in the fall of 2014 and a list of high-paying technical skills. I am sure that after viewing the list you might want to learn some of those skills. Big data is worth $116,414 with nearly 35,000 job listings and Data Science is… Read more →

What is Wrong With All Machine Learning Models

by awahid on March 10, 2015 in blog

John Langford, a machine learning research scientist at Microsoft and author of the weblog hunch.net, has recently published a brilliant article about flaws in machine learning models. The link to his original article is currently down, but you can find the article below. John's Article (Taken from here) Attempts to abstract and study machine learning are within some given… Read more →

Machine Learning is a new form of statistics

by awahid on March 1, 2015 in blog

Statistics and machine learning are thought to be two separate fields. But if you read good articles from highly reputed machine learning journals, you will realize that these two fields are merging. Not too long ago, a new field, “statistical machine learning”, made it clear that these two fields have much in common. Coming from a computer science background, I… Read more →

Data science related top 20 short tutorials (must read)

by awahid on February 3, 2015 in blog

I have finished reading the 20 short tutorials suggested by datasciencecentral. It's amazing; I particularly liked the clustering and bigdata related articles. The complete list follows; go ahead and let me know which article is your favourite. Tutorial: How to detect spurious correlations, and how to find the … Practical illustration of Map-Reduce (Hadoop-style), on real data Jackknife logistic and linear regression for… Read more →

Datascience explained in form of a poster

by awahid on January 25, 2015 in blog

ICRIS (http://www.icris.nl) made a simple poster describing the fundamentals of data science. Click on the following image to see the poster in high resolution. Read more →

Basics of Bigdata

by awahid on January 22, 2015 in blog • 0 Comments

Bigdata is often misunderstood as simply very large data; however, size is just one aspect of bigdata. The term Bigdata refers to data that is too complex for traditional approaches to handle. Bigdata has the following characteristics. Volume – Large amount of data. Velocity – Rapid generation of data. Variability – Inconsistency of the data. Veracity – Quality of… Read more →

Weka or LingPipe for New Data Scientist

by awahid on January 11, 2015 in blog • 0 Comments

I started working with Weka and LingPipe around 2 years ago. My task was to develop a better clustering algorithm for text data. I initially used Weka to familiarize myself with basic clustering algorithms; however, I found that Weka has more documentation for classification algorithms than for clustering algorithms. I came across the LingPipe framework on the internet and found that their blog provides… Read more →

Clustering Bigdata

by awahid on January 4, 2015 in blog • 0 Comments

Clustering large amounts of data brings complexity and requires special clustering algorithms. Common clustering algorithms like k-means are not designed to handle such tasks. Anil K. Jain, a big name in the domain of clustering algorithms, explains this phenomenon in his video lecture (http://videolectures.net/single_jain_bigdata/). He provides a solution, the “approximate k-means algorithm”, which clusters large amounts of data (bigdata). Other researchers like Xiao Cai et.… Read more →

Abdul Wahid

Providing AI/Machine Learning/Data Science/BigData Solutions