Deep Mukhopadhyay, Ph.D.

Data Scientist and Data Mechanic

April 4, 2016 by deepstatorg

Netflix competition. As datasets get BIG and COMPLEX, the most difficult challenge for a statistical scientist is to figure out “Where is the information hidden?” It’s an interactive process of investigation rather than a passive application of algorithms and calculation of error rates. Two skills are critical: (1) “look at the data,” which is missing in the mechanical push-the-button culture; and (2) learn “how to question the data,” rather than only answering a specific question. These skills allow data scientists to discover the unexpected in addition to the usual verification of the expected. This raises the question of whether

  • the Data Science training curriculum should look like a long manual of specialized methods and a series of cookbook algorithms;
  • or it should train students (and industry professionals) in Scientific Data Exploration (Sci-Dx): a systematic and pragmatic approach to data modeling that addresses the “Monkey and banana problem” [Pigeon’s approach] for practitioners. [I believe Wolfgang Köhler’s “insight learning” idea can guide us in developing such a curriculum.]
The first path will produce DataRobots, not Data Scientists. The latter goal looks out of reach unless we figure out how to design the “LEGO Bricks” of Statistical Science (the fundamental building blocks of statistical learning), which help us understand disparate statistical procedures from a common perspective (thus reducing the size of the manual) and can be appropriately combined to build versatile data products brick by brick.

Filed Under: Blog Tagged With: Data Mechanic, Data Science, Data Scientist, Kaggle Syndrome

