** For New Graduates/Analysts **
We talked about analyzing and presenting a large and complex dataset in 30 minutes in the prior blog post. Would one handle it differently if one had 60 minutes? Here is one approach one might like to consider:
1. While starting out, many young folks tend to underestimate themselves. The very fact that one has been tasked with this critical presentation speaks volumes, so one must learn to take full advantage of this visibility in narrowing down the (internal) competition. These meetings are often attended by other department heads and high-level client representatives, which can lead to a significant loss of time in unrelated (business) discussions. The best way to prepare for such contingencies is to split the presentation into a two-phase solution where phase-1 leads seamlessly to phase-2.
2. In a business environment, it's never a good idea to start with a complicated stat/econ model; instead, one must start a bit slow but use one's analytical acumen and presentation skill to gradually bring people onto the same page, retaining maximum control over the presentation (in terms of both time and theme). Therefore, the phase-1 solution should be the same as the full 30-minute solution we detailed in a prior blog post (including the sub-market analysis). Even if the meeting drifts into unrelated business chit-chat, off and on, the presenter will still be able to squeeze in the phase-1 solution, thus offering at least a baseline solution. By contrast, if one has only a single all-encompassing solution, one could end up offering virtually nothing.
3. Now that the phase-1 presentation, which established a meaningful baseline, is over, one should be ready to transition to the more advanced phase-2 solution. In other words, it's time to show off one's modeling knowledge. The phase-1 presentation comprised a baseline Champ-Challenger analysis, where the Champ was the Monthly Median Sale Price, and the Challenger was the Monthly Median SP/SF. The presenter used the "Median" to avoid having to clean up the dataset for significant outliers. Here is the caveat of sales analysis, though: Sales, individually, are mostly judgment calls; for example, someone bent on buying a pink house would overpay, while an investor would underpay by luring a seller with a cash offer, etc. In the middle (middle 68% of the bell curve), the so-called informed buyers would use five comps, usually hand-picked by the salespeople, to value the subjects, which is not an exact science either.
4. Now, let's envision where the presenter would be at this stage: 30 minutes on hand and brimming with confidence. But it's not enough time to try to develop and present an accurate multi-stage, multi-cycle AVM. So, it's good to settle for a straightforward regression-based modeling solution, leaving time to add a few new slides to the original presentation. Ideally, the model should be built as one log equation with a limited number of variables (though covering all three major categories). The variables one might like to choose are: Living Area, Age, Bldg Style, Grade, Condition, and School/Assessing District, avoiding the 2nd tier variables (e.g., Garage SF, View, Site Elevation, etc.).
5. One should use Time Adjusted Sale Price (ASP) as the dependent variable in the Regression model, explaining the connection between the presentations (meaning phase-1 and 2) so the audience (including the big bosses like the SVP, EVP, etc.) understands that the two phases are not mutually exclusive; rather, one is the stepping stone to the other. At this point, the presenter could face this question: "Why did you split it up into two?" The answer must be short and truthful: "It's a time-based contingency plan."
6. At this point, the presenter must keep the regression output handy but should not insert it into the main presentation, considering it is a log model (the audience may not relate to the log parameter estimates). If the issue comes up, the presenter should talk about the three critical aspects of the model: (a) the variable selection (how all three categories were represented), (b) the most vital variables as judged by the model (walking down the t-stats and p-values), and (c) the overall accuracy of the model (zeroing in on the primary stats like r-squared, F-statistic, confidence, etc.).
7. The presenter must explain the model results in three simple steps: (a) Value Step: ASPs vs. Regression values, showing the entire percentile curve (1st to 99th percentile) rather than the median values only, and also pointing out the inherent smoothness of the Regression values vis-a-vis the ASPs; (b) Regression Step: How some arm's-length sales could be somewhat irrational on both ends of the curve (<=5th and >=95th percentiles) and why the standard deviation of the Regression values was so much lower than that of the ASPs; and (c) Ratio Step: Stats on the Regression Ratio (Regression Value to ASP). It's easier to explain the Regression Ratios than the natural numbers, so spending more time on the ratios would make the presentation more effective.
8. The presenter should explain the outlier ranges, meaning the ratios below the 5th and above the 95th percentile, or below 70 and above 143. Considering this is the outlier-free output, it's good to display Std Dev, COV, COD, etc. The outlier-free stats would be significantly better than the prior (with outliers) ones. Another common outlier question is: "Why no waterfront in your model?" The answer is simple: Generally, waterfront parcels comprise less than 5% of the population, making it challenging to test their representativeness. (In an actual AVM, if sold waterfront parcels properly represent the waterfront population, waterfront could be tried in the model, as long as it clears the multicollinearity test as well.)
9. Last but not least, one must be prepared to face an obvious question: "What is the point of developing this model?" Here is the answer: "A sale price is more than a handful of top-line comps. It comprises an array of important variables like size, age, land, building characteristics, fixed and micro-locations, etc., so only a multivariate model can do justice to sale prices by properly capturing and representing all of these variables. The output from this Regression model is the statistically significant market replica of the sales population. Moreover, this model can be applied to the unsold population to generate significant market values. Simply put, this Regression model is an econometric market solution. Granted, the unsold population could be comp'd, but that's a very time-consuming and subjective process."
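For those who would like to see the mechanics behind points 4 through 8, here is a minimal Python sketch, not a production AVM. It rests on several assumptions of mine: the log equation takes the form ln(ASP) = b0 + b1*ln(Living Area) + b2*Age, the categorical variables (Bldg Style, Grade, Condition, District) are left out for brevity, the sales are synthetic and generated only to exercise the code, and all function names are hypothetical. COD and COV follow the usual ratio-study definitions: average absolute deviation from the median ratio divided by the median, and standard deviation divided by the mean, both in percent. Ratios are kept as plain ratios, so 0.70 here corresponds to 70 in the text above.

```python
import math
import random
import statistics


def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_log_model(living_area, age, asp):
    """OLS fit of ln(ASP) = b0 + b1*ln(living_area) + b2*age (assumed form)."""
    x = [[1.0, math.log(sf), float(a)] for sf, a in zip(living_area, age)]
    y = [math.log(p) for p in asp]
    k = len(x[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_value(coefs, living_area, age):
    """Convert the log-scale prediction back to dollars."""
    b0, b1, b2 = coefs
    return math.exp(b0 + b1 * math.log(living_area) + b2 * age)


def trim_outliers(ratios, low_pct=5, high_pct=95):
    """Keep only ratios between the low and high percentile cut points."""
    cuts = statistics.quantiles(ratios, n=100)  # 99 percentile cut points
    low, high = cuts[low_pct - 1], cuts[high_pct - 1]
    return [r for r in ratios if low <= r <= high]


def ratio_stats(ratios):
    """Median, COD, and COV of the ratios (COD and COV in percent)."""
    med = statistics.median(ratios)
    avg_abs_dev = statistics.mean([abs(r - med) for r in ratios])
    return {
        "median": med,
        "cod": 100.0 * avg_abs_dev / med,
        "cov": 100.0 * statistics.stdev(ratios) / statistics.mean(ratios),
    }


# Synthetic sales, generated only to exercise the functions (not real data).
rng = random.Random(0)
living_area = [rng.uniform(900, 3500) for _ in range(200)]
age = [rng.uniform(0, 80) for _ in range(200)]
asp = [
    math.exp(5.0 + 0.9 * math.log(sf) - 0.004 * a + rng.gauss(0, 0.15))
    for sf, a in zip(living_area, age)
]

coefs = fit_log_model(living_area, age, asp)
values = [predict_value(coefs, sf, a) for sf, a in zip(living_area, age)]
ratios = [v / p for v, p in zip(values, asp)]  # Regression Value to ASP
trimmed = trim_outliers(ratios)

print("with outliers:", ratio_stats(ratios))
print("outlier-free: ", ratio_stats(trimmed))
```

In a real 30-minute build, one would more likely lean on a stats package that also reports the t-stats, p-values, and r-squared mentioned in point 6; the hand-rolled solver above is only there to keep the sketch free of dependencies.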
-Sid Som
homequant@gmail.com