
Deep Mukhopadhyay, Ph.D.


Next-Generation Statisticians

Confirmatory Culture: Time To Reform or Conform?

November 1, 2016 by deepstatorg

THEORY

Culture 1: Algorithm + Theory. The role of theory is to justify or confirm.

Culture 2: Theory + Algorithm. Theory moves from confirmatory to constructive: it explains the statistical origin of the algorithm(s), that is, where they came from. Culture 2 views “Algorithms” as the derived product, not the fundamental starting point [this point of view separates statistical science from machine learning].

PRACTICE 

Culture 1: Science + Data. The job of a statistician is to confirm scientific guesses, and thus to play happily in everyone’s backyard as a confirmatist.

Culture 2: Data + Science. An exploratory, nonparametric attitude: the statistician plays in the front yard as the key player, guiding scientists to ask the “right question”.

TEACHING 

Culture 1: Teaching proceeds in the following sequence: for (i in 1:B) { Teach Algorithm-i; Teach Inference-i; Teach Computation-i }. By construction, it requires extensive bookkeeping and memorization of a long list of disconnected algorithms.

Culture 2: The pedagogical effort emphasizes the underlying fundamental principles and statistical logic whose consequences are algorithms. This “short-cut” approach substantially accelerates learning by making it less mechanical and intimidating.

Should we continue to conform to the confirmatory culture, or is it time to reform? The choice is ours, and the consequences are ours as well.

Filed Under: Blog Tagged With: 21st-century statistics, Data Science, Next-Generation Statisticians, Science of Statistics

The Scientific Core of Data Analysis

November 26, 2015 by deepstatorg

Richard Courant’s view: “However, the difficulty that challenges the inventive skill of the applied mathematician is to find suitable coordinate functions.” He also noted that “If these functions are chosen without proper regard for the individuality of the problem the task of computation will become hopeless.”

This leads me to the following conjecture: an efficient nonparametric data transformation or representation scheme is the basis for almost all successful learning algorithms. It is the Scientific Core of Data Analysis, and it should be emphasized in the research, teaching, and practice of 21st-century Statistical Science in order to develop a systematic and unified theory of data analysis (the foundation of data science).
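A toy numerical illustration of Courant's point may help. It is a minimal sketch and not part of the original post: the signal, the two bases, and the helper name `fit_error` are all assumptions chosen for illustration. The same number of coordinate functions is fitted by least squares twice, once with generic monomials and once with sines and cosines that respect the periodic "individuality" of the problem.

```python
import numpy as np


def fit_error(design, y):
    """Least-squares fit of y on the columns of design; returns the RMS residual."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sqrt(np.mean((design @ coef - y) ** 2)))


# Toy periodic signal on [0, 1] (an illustrative assumption, not real data).
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * 3 * x)

k = 7  # same number of coordinate functions in both representations

# Generic choice: monomials 1, x, ..., x^(k-1).
monomials = np.vander(x, k, increasing=True)

# Problem-aware choice: a constant plus sines and cosines at frequencies 1..3.
freqs = np.arange(1, 4)
fourier = np.column_stack(
    [np.ones_like(x)]
    + [np.sin(2 * np.pi * f * x) for f in freqs]
    + [np.cos(2 * np.pi * f * x) for f in freqs]
)

err_monomial = fit_error(monomials, y)
err_fourier = fit_error(fourier, y)
print(f"monomial basis RMS error: {err_monomial:.3g}")
print(f"Fourier basis RMS error:  {err_fourier:.3g}")
```

With seven functions each, the sine and cosine basis reproduces this signal essentially to machine precision, while the monomial basis leaves a clearly visible residual. The example is deliberately rigged, since the signal lies in the span of the second basis, but that is the point of the quote: the representation, not the fitting routine, decides whether the computation is easy or hopeless.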

Filed Under: Blog Tagged With: 21st-century statistics, Core of Data Analysis, Data Science, Next-Generation Statisticians

Two Kinds of Mathematical Statisticians: Connectionist and Confirmatist

June 10, 2015 by deepstatorg

Connectionist: Mathematicians who invent and connect novel algorithms based on new fundamental ideas that address real data modeling problems.

Confirmatist: Mathematicians who prove why an existing algorithm works under certain sets of assumptions/conditions (a post-mortem report).

Theoreticians of the first kind (a few examples: Karl Pearson, Jerzy Neyman, Harold Hotelling, Charles Stein, Emanuel Parzen, Clive Granger) are, however, much rarer than those of the second. The current culture has failed to distinguish between these two types (which are very different in style and motivation) and has put excessive importance on the second. This has created an imbalance and often gives a wrong impression of what “Theory” means. We need to discover new theoretical tools that not only prove why already invented algorithms work (a confirmatory check) but also provide insight into how to invent and connect novel algorithms for effective data analysis: 21st-century statistics.

Filed Under: Blog Tagged With: 21st-century statistics, Confirmatory Theory, Exploratory Theory, Next-Generation Statisticians

Impact: The way I see it

May 16, 2015 by deepstatorg

Theoretical beauty × Practical utility = Impact of your work.

  • By theoretical beauty, I mean the capacity of a concept or idea for “Unification” (not proving consistency or a rate of convergence).
  • Practical utility denotes the generic usefulness of the algorithm, that is, its simultaneous applicability to many problems: wholesale algorithms (not just writing R packages and coding).
  • The goal is to ensure that neither quantity on the LHS of the equation is close to ZERO. Perfect balance is required to maximize the impact (which is an art).

Filed Under: Blog Tagged With: 21st-century statistics, impact, Next-Generation Statisticians

Models of Convenience to Useful Models

April 21, 2015 by deepstatorg

An article by Mark van der Laan has a number of noteworthy aspects. I feel it’s an excellent just-in-time reminder, which rightly demands a change in perspective: “We have to start respecting, celebrating, and teaching important theoretical statistical contributions that precisely define the identity of our field.” The real question is: which topics are those? Answer: which statistical concepts and tools are routinely used by non-statistician data scientists for their data-driven discovery? How many of them were discovered in the last three decades (compare that with the number of so-called “top journal” papers published every month!)? Are we moving in the right direction? Isn’t it obvious why “our field has been nearly invisible in key arenas, especially in the ongoing discourse on Big Data and data science” (Davidian 2013)? Selling the same thing under a new name is not going to help (in either research or teaching); we need to invent and recognize new ideas that are both beautiful and useful.

I totally agree with what he said: “Historically, data analysis was the job of a statistician, but, due to the lack of rigor that has developed in our field, I fear our representation in data science is becoming marginalized.” I believe the first step is to go beyond the currently fashionable plug-and-play model-building attitude and make model building an Interactive and Iterative (thus more enjoyable) process based on a few fundamental and unified rules. Another way of saying the same thing is, “the smartest thing on the planet is neither man nor machine – it’s the combination of the two” [George Lee].
He refers to the famous quote “All models are wrong, but some are useful.” He also expresses the concern that “Due to this, models that are so unrealistic that they are indexed by a finite dimensional parameter are still the status quo, even though everybody agrees they are known to be false.” To me the important question is: can we systematically discover the useful ones rather than starting with a guess based solely on convenience? Convenience typically comes in two types: theoretical and computational. (Classical) theoreticians like to stay in the perpetual fantasy world of “optimality,” whereas the (present-day) computational goal is to make it “faster” by hook or by crook. It seems to me that the ultimate goal is to devise a “Nonparametric procedure to Discover Parametric models” (The Principle of NDP), which are simple and better than “models of convenience.” Do we have any systematic modeling strategy for that? [An example]

In van der Laan’s words: “Stop working on toy problems, stop talking down theory, stop being attached to outdated statistical methods, stop worrying about the politics of our journals and our field. Be a true and proud statistician who is making an impact on the real world of Big Data. The world of data science needs us; let’s rise to the challenge.”

Filed Under: Blog Tagged With: 21st-century statistics, Model discovery, Next-Generation Statisticians, Science of Model Building, Science of Statistics


