Wednesday, November 09, 2005

Learn from my mistakes....

....and start using the do-file editor.

A couple of you have asked me why I insist on using the do-file editor rather than typing the commands directly into Stata. There are several reasons, and they all come from my experience with a data set that I was using over the summer:

  1. The do-file is a convenient way to keep a record of all of your data revisions, so you don't have to do them again... and again... and again. For instance, if I want a particular subsample of a dataset and I know I'll have to keep using it, I don't have to remember all of the filters I applied. I just refer to my do-file (or even better - just run it) and it's there - every time.
  2. One time, I came up with a particularly good estimation that fit my intuition by including some additional variables. I left it alone for a couple of weeks, and then tried to reconstruct the specification from memory. It didn't work out very well. I should have kept this new estimation in my do-file.
  3. When you exit, Stata asks whether you want to save the changes that you've made to the data. The change could be something benign, like keeping the variable age-squared in your file, or it could be that you dropped a few variables and Stata would like to know if you want to make this change permanent. Saying yes is a bad idea. Really. I made a revision permanent to a dataset that I had painstakingly compiled over several days. Not only do you lose your original data, but if you're tempted to run your estimations again on the revised dataset, you're never quite sure whether it still includes everything that should be in there. Cardinal rule number 1 - never permanently modify your original sample - even if you're sure that the revision is 'safe'. Better to keep your modifications limited to your do-file. I cannot emphasize this enough.
Feel free to let me know if you have any questions. Better yet, ask Prof. Deb. If you don't know what a do-file looks like, on Friday I'll post the one that I made for Homework 5 on the blog.
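
In the meantime, here is a rough sketch of how the three points above might look in a do-file. It is only an illustration, not the Homework 5 file: it uses the auto dataset that ships with Stata, and the file names, the filter, and the regression are all made up for the example.

```stata
* sketch.do -- hypothetical file name, for illustration only
set more off

* Keep a log so the output of every run is on record
capture log close
log using sketch.log, replace text

* Load the ORIGINAL data fresh each time (auto ships with Stata)
sysuse auto, clear

* Point 1: the subsample filters, written down once
* (an arbitrary filter, chosen only for the example)
keep if foreign == 0 & !missing(rep78)

* Point 3: derived variables live here, not in the original file
generate weight_sq = weight^2

* Point 2: the estimation, so it never has to be rebuilt from memory
regress price weight weight_sq mpg

* If a modified copy is really needed, save it under a NEW name
* (hypothetical) so the original is never overwritten
save auto_working_copy, replace

log close
```

Running the file again from the top reproduces the same subsample, the same variables, and the same estimation, and the original data stays untouched.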
