Comments on Pragmatic Programming Techniques: BI at large scale
Blog author: Ricky Ho (http://www.blogger.com/profile/03793674536997651667)
Feed updated: 2024-03-03T00:23:26.457-08:00

Comment posted 2010-12-08T00:28:14.877-08:00

Ricky,

I got your point. Diversity is good. But is it good for a Developer as well as a Researcher? Is it inevitable?

For now it seems like the world has split in two: 1) Small Data vs. Big Data, 2) Declarative/Imperative Programming vs. Functional Programming, 3) Single Machine vs. Cluster, 4) Relational (Schema-wise) vs. NoSQL.

I mean, right now, it is black and white! And if you're not an expert in both, you can never pick the right solution without countless trials. So a lot of time is wasted on the tool, instead of concentrating on the problem. (Take MapReduce programming, for example. Or the schema-less BigTable approach.)

Long ago I switched to Java from C/C++. I did so because I was sick and tired of dealing with the programming language mess... And so many people just keep on doing it!!! I would never go back.

That's what I'm eager for in what I call the BI/BL gap: the arrival of some core common "THING". (The gray matter, if you want, as opposed to the B/W today. ;-) )

What's your opinion: might it be a common programming language to fit both? Other ideas?

Posted by Anonymous (https://www.blogger.com/profile/17908351268608842239)

Comment posted 2010-12-07T19:19:22.369-08:00

Regarding infrastructure, I don't think one size will fit all. I am leaning towards a combination of Map/Reduce (to batch-process data at large scale) and NoSQL (to allow data to be retrieved in real time). I also think CEP (complex event processing) techniques should be part of it as well.

Should we do "big data with simple processing" or "small data (sampled) with sophisticated logic"? I think this will be decided on a case-by-case basis.
But I would like to have a wide spectrum of solutions so I can pick the optimal point.

I am a big fan of "approximation algorithms", where the user can tune the tradeoff between accuracy and workload capacity.

As Michael points out, how to sample correctly is the key to success.

Posted by Ricky Ho (https://www.blogger.com/profile/03793674536997651667)

Comment posted 2010-12-07T14:11:57.264-08:00

Ricky:

You echo the sentiment I have heard from statisticians, that big data can become an endeavor unto itself which wears out the team before real insights are gained.

Stratified sampling is definitely the way to go, although it needs to be implemented correctly.

If more decision makers would appropriately grasp statistical significance and the magnitude of the effect, we could spend more time analyzing data and less time pushing it around the block.

Michael D. Healy
http://michaeldhealy.com/
http://twitter.com/michaeldhealy

Posted by Michael D. Healy (http://michaeldhealy.com/)

Comment posted 2010-12-07T04:48:39.233-08:00

Ricky, hi!

What do you think: are the infrastructure requirements, and the solutions thereof, fundamentally different between:
a) the business intelligence (BI) tasks [computation-/storage-intensive, math-rich] and
b) the business logic (BL) tasks [user-IO responsive, data-rich]?

You seem to be familiar with both (as am I), and I wondered about your opinion...

Posted by Anonymous (https://www.blogger.com/profile/17908351268608842239)