I don’t have data readily at hand from the beginning of the project. And in 1995 and 1996 I continued to do research, but stopped editing text, because I was pulled away to finish Mathematica 3 (and the book about it). But otherwise one sees inexorable progress, as I systematically worked out each chapter and each area of the science. One can see the time it took to write each chapter (Chapter 12 on the Principle of Computational Equivalence took longest, at almost two years), and which chapters led to changes in which others. And with enough effort, one could drill down to find out when each discovery was made (it’s easier with modern Mathematica automatic history recording). But in the end -- over the course of a decade -- from all those individual keystrokes and file modifications there gradually emerged the finished A New Kind of Science.
It’s amazing how much it’s possible to figure out by analyzing the various kinds of data I’ve kept. And in fact, there are many additional kinds of data I haven’t even touched on in this post. I’ve also got years of curated medical test data (as well as my not-yet-very-useful complete genome), GPS location tracks, room-by-room motion sensor data, endless corporate records -- and much much more.
And as I think about it all, I suppose my greatest regret is that I did not start collecting more data earlier. I have some backups of my computer filesystems going back to 1980. And if I look at the 1.7 million files in my current filesystem, there’s a kind of archeology one can do, looking at files that haven’t been modified for a long time (the earliest is dated June 29, 1980).
Here’s a plot of the latest modification times of all my current files:
The colors represent different file types. In the early years, there’s a mixture of plain text files (blue dots) and C language files (green). But gradually there’s a transition to Mathematica files (red) -- with a burst of page layout files (orange) from when I was finishing A New Kind of Science. And once again the whole plot is a kind of engram -- now of more than 30 years of my computing activities.
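For anyone who wants to try this kind of filesystem archeology, here is a minimal Python sketch of the idea. It is not the code behind the plot above: the function name `modification_years` and the extension-to-category table are assumptions made up for illustration. It walks a directory tree and tallies files by year of last modification and a coarse file type.

```python
import os
from collections import Counter
from datetime import datetime, timezone

# Hypothetical mapping from file extension to a coarse category;
# extend it to match whatever file types you actually have.
CATEGORIES = {
    ".txt": "text",
    ".c": "c",
    ".h": "c",
    ".nb": "mathematica",
    ".m": "mathematica",
}

def modification_years(root):
    """Count files under root by (year of last modification, category)."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # skip files that vanish or cannot be read
            # Use UTC so the year does not depend on the local time zone.
            year = datetime.fromtimestamp(mtime, tz=timezone.utc).year
            ext = os.path.splitext(name)[1].lower()
            counts[(year, CATEGORIES.get(ext, "other"))] += 1
    return counts
```

Calling `modification_years` on a home directory gives a table that can be fed to any plotting tool. Keeping the raw timestamps instead of just the year would be needed to reproduce a dot-per-file plot.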
So what about things that were never on a computer? It so happens that years ago I also started keeping paper documents, pretty much on the theory that it was easier just to keep everything than to worry about what specifically was worth keeping. And now I’ve got about 230,000 pages of my paper documents scanned, and when possible OCR’ed. And as just one example of the kind of analysis one can do, here’s a plot of the frequency with which different 4-digit "date-like sequences" occur in all these documents:
Of course, not all these four-digit sequences refer to dates (especially for example "2000") -- but many of them do. And from the plot one can see the rather sudden turnaround in my use of paper in 1984 -- when I turned the corner to digital storage.
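The counting itself is simple to sketch. The Python below is an illustration under my own assumptions, not the analysis that produced the plot: the function name `date_like_counts` and the 1900 to 2100 cutoff for what counts as "date-like" are both hypothetical. It finds standalone four-digit runs in OCR'ed text and tallies those that fall in a plausible year range.

```python
import re
from collections import Counter

# Match exactly four digits that are not part of a longer run of digits.
FOUR_DIGITS = re.compile(r"(?<!\d)\d{4}(?!\d)")

def date_like_counts(pages, lo=1900, hi=2100):
    """Tally four-digit sequences in [lo, hi] across an iterable of page texts.

    The [lo, hi] range is an assumed definition of "date-like".
    """
    counts = Counter()
    for text in pages:
        for match in FOUR_DIGITS.findall(text):
            value = int(match)
            if lo <= value <= hi:
                counts[value] += 1
    return counts

pages = [
    "Received 12 March 1984. Invoice 20001234.",
    "In 1984 and 1979; about 2000 items.",
]
counts = date_like_counts(pages)
print(sorted(counts.items()))
```

Note that "2000" in the second page is counted even though it is a quantity and not a year, which is exactly the kind of false positive mentioned above. The eight-digit invoice number is skipped because of the lookarounds in the pattern.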
What is the future for personal analytics? There is so much that can be done. Some of it will focus on large-scale trends, some of it on identifying specific events or anomalies, and some of it on extracting "stories" from personal data.
And in time I’m looking forward to being able to ask Wolfram|Alpha all sorts of things about my life and times -- and have it immediately generate reports about them. Not only will it be able to act as an adjunct to my personal memory, but it will also be able to do automatic computational history -- explaining how and why things happened -- and then make projections and predictions.
As personal analytics develops, it’s going to give us a whole new dimension to experiencing our lives. At first it all may seem quite nerdy (and certainly as I glance back at this blog post there’s a risk of that). But it won’t be long before it’s clear how incredibly useful it all is -- and everyone will be doing it, and wondering how they could have ever gotten by before.
And wishing they had started sooner, and hadn’t "lost" their earlier years.