Thursday, June 22, 2017

Considering the possibility of retrenchment: its potential impact and my response

Well, well, what do ya know? Reporting season is upon us and my superiors have to report to top management as to whether our research KPIs have been met. From the topic title, you could probably guess that my team failed to hit their KPIs. Data collection has been abysmally slow and, without data and research findings, research funding may get cut.

When one of my bosses, let's call her Dr Janice (not her real name), implied the above, my fellow colleagues were in shock. Me? I was strangely calm. It was not until Dr Janice left the office that we all became chicken littles, freaking out over the possibility of job loss.


Next, as per typical human behaviour, my colleagues engaged in the blame game. "Dr Janice could have formulated the research participant inclusion criteria less stringently. That way, more participants could have entered into the study and this would have been a non-issue."

The above is one way to look at the current situation. An alternative perspective is to consider the operational workflow of the entire research process. All of my colleagues are data collection personnel. They handle administrative stuff, meet participants, and collect data. Then, they pass the data over to Unintelligent Nerd, the database developer/database administrator/data analyst/social science researcher guy (yeah, my current role is quite rojak; a small bit of everything) for data entry, database development, programming scripts to automate processes, and statistical analyses. 

After that, it goes to my three bosses (let's call them Dr Jason, Dr Jeremy, and Dr Janice). That's that.

If there is a bottleneck at data collection, we are screwed. How fast data is collected depends on my colleagues' social skills, rapport building, and pro-activeness. Unfortunately, my colleagues are not proactive individuals. If can slack today, then slack lah! Why so kan cheong to collect data? 

Speaking of which, I'm a kan cheong spider. I'm a Type A who likes to pia here and pia there while they are Type Bs who are more chill. As we are stationed in a satellite office away from the watchful eyes of our bosses, they have been reporting to work late by 2-3 hours. They have been caught multiple times when my bosses spring surprise visits on us. Hence, there's the possibility that the conversation is meant to imply "buck up!" to them (and is an empty threat?)

Yet I digress.

So, what will Unintelligent Nerd do in the case that the whole study folds? Let's consider my financial assets first. In light of the possibility of a market crash, I have been liquidating my stock portfolio over May and June, taking profits whenever Mr Market allows me to. Off the top of my head, I've liquidated around 30% of my portfolio. As a result, my cash/war chest grew from 5% of my net worth to around 25% of my net worth. Roughly, I have 3 months' worth of my gross pay in my emergency fund (in fact, I have diverted some of my profit-taking to my emergency fund instead of my war chest). Short-term, I should be doing okay if I do get retrenched.

What if I get retrenched and then the market crashes? Well, I would still invest, albeit more cautiously. To play safe, I might not even use up all of my war chest. Instead, I would bolster my emergency fund with my war chest!

In terms of getting a new job, I think I might do fine? My domain knowledge in social science + statistical know-how is still sought by social science research employers (see here). Now that I have more experience in statistical programming, I could also try applying for statistics-related jobs (see here). Anyway, since I've been pursuing breadth over depth (a lot of "beginner level" knowledge in various fields), I think I should be flexible enough to meet the requirements that employers would throw at me. Worst case scenario is to go become an insurance agent (see here). Hopefully that won't happen......(not my cup of tea, personality-wise).

What else have I learned from this episode? First, one of my colleagues adamantly believes that longitudinal studies suck. There is a lot to plan and many unknown-unknowns. If the unknown-unknowns are discovered at a late stage, correcting for them may prove to be very difficult. I need to pick my bosses' brains more to learn from their experience in this area.

Second, could you diversify your risk away? Dr Jason and Dr Jeremy have other studies they are involved in. If this study blows up, they still have their rice bowl. For Dr Janice, well.......you get the picture.

That's all for now. Future pay will either go into my emergency fund or my war chest. I'll only whack hard when the market correction comes.

Monday, June 19, 2017

Kadenze: the "Coursera" for creative professionals

While searching for Massive Open Online Courses (MOOCs) the other day, I got a pleasant surprise when I came across Kadenze.


According to its website, Kadenze is a MOOC provider that is geared towards Technology and the Arts. What's more interesting (to me at least) is that Kadenze offers a module on Deep Learning. Apart from Udacity's offering on Deep Learning, I know not of any other MOOC providers that offer courses delving into said subject. There are other courses on machine learning and data mining as well, just that they are contextualized to the arts field.

Soooooooo, what MOOC providers do we have now?

Coursera for all-purpose learning?

EdX for first runner-up to Coursera?

Udacity for hands-on tech education

FutureLearn from Open University

Khan Academy to build one's foundation?

Microsoft Professional Program

Udemy for bei kambings to part with their money when they can take similar courses for free from other MOOC providers

And last but not least, there's Kadenze.

Monday, June 12, 2017

Recognizing and seizing opportunities

Not too long ago, I came to learn of yet another individual from my field starting a statistics tutoring business.

I have toyed with the idea before. After all, quite a huge proportion of people from my field of social science are absolutely terrified of statistics. Some of them have even shared with me that their choice of major was predicated on the fact that the fields of arts, social sciences, and humanities lacked mathematics (and concomitantly, its sister subject, statistics). Well, their hopes were dashed since statistics is a course requirement. :P

Come assignment/thesis-submission and statistics exam time, panic-stricken faces are a ubiquitous sight on campus. Lecturers and tutors will have their timetables filled to the brim with students' requests for help. For those who are a step too late in approaching the lecturers and tutors, stats nerds like yours truly will experience a sudden spike in friendliness from every Tom, Dick, and Harry. -_-

I digress.

A year back or so, one of my (now) ex-bosses shared the same sentiments with me. During his tour as a Masters student, one compulsory task he had to undertake was to teach undergraduate social science students. From there, he experienced the same aversion towards statistics among his students.

What he shared with me next made absolute sense. Of all the modules we are required to take, statistics is one of the few subjects that grants us social scientists a competitive advantage (the other subject being Cognitive Neuroscience. Okay, I'm being biased here). It is highly transferable across fields. Got a degree in social science, but wanna use something that serves as a foot-in-the-door to gain access to another industry? That will be statistics.

Meanwhile, the general social science populace still disses its best asset. They don't seem to realize that they possess a diamond in the rough right under their noses: an asset that they could nurture and hone, especially in light of the trends in society towards artificial intelligence/machine learning/statistics/data science/analytics.

Well, I'm not complaining. When the time comes, I shall go set up my own consultancy providing statistical solutions for people who are not adaptive enough or are too late to the game. Or better still. Build competency in the subjects that come after statistics. ;)

Random trivia: Did you know that there is such a phenomenon known as mathematical anxiety? That's what people experience if they fear mathematics. A similar phenomenon exists for statistics too. Just throw in "statistics anxiety" into Google.

Saturday, June 3, 2017

Yahoo Finance API discontinued

Not too long after I wrote my R script to automate the computation of Spearman's correlation of stock counters (see here), the script failed to work. A real bummer indeed. To cut to the chase, the underlying import function from quantmod was down.

As expected, the quantmod package developer received a flurry of questions from disgruntled users who were also unable to import the historical stock data from Yahoo Finance into R for data processing. Well, it wasn't the developer's fault. Apparently, Yahoo Finance stealthily decided to discontinue the data importing functionality without rhyme or reason.

For those interested, see the Yahoo Help Community thread on the matter here.

As quite a lot of people depend on Yahoo Finance's historical stock data for their data processing/stock market app/personal needs, a petition has been made to Yahoo to reinstate the service. Quite a lot of people, I noticed, were even willing to pay a monthly fee in order to be able to download historical stock data into the software of their choice.

Based on some of the responses I have read, some people surmised that Yahoo Finance realized that downloads of historical stock data through third-party applications do not generate ad revenue for the company. Hence, removing such a functionality would encourage users to access stock information through their browser (and hopefully click on some of their ads!).

Frankly, I don't know why the Yahoo Finance team doesn't realize that they are sitting on a revenue goldmine and offer historical stock data downloads to desperate users who are willing to pay for said service.

The next best alternative is not really an option. Though Google Finance provides historical stock data of SGX counters, there is no way you could readily and conveniently import that same data into the software of your choice. The exception is if you are looking at NYSE and NASDAQ counters.

For now, the quantmod package developer has found a workaround to the problem and my script works as intended. I'm just keeping my fingers crossed that it will continue that way.

Sunday, May 7, 2017

Correlations among the Fraser family members using R

So I've been fiddling around with R over the weekends. R is both the name of a programming language and an open-source software which is used for statistical computing. I'm using R in the workplace and it comes equipped with a full suite of packages for various kinds of data manipulation/analysis/etc.

Not too long ago, I wondered whether R could be used for investing-related purposes. True enough, there is a package known as quantmod which is a treasure trove for traders. After some fiddling around, what I like about quantmod is its ease in importing data from various financial sources (e.g. Yahoo Finance, Google Finance, etc.) into R itself.

As I'm currently vested in some of the Fraser family members, I thought to myself how fun it would be if I could create a function in R that will be able to tell me how correlated the Fraser family members are.

Before proceeding further, there are some assumptions I have made. First, I've set the time period from 1 January 2016 to 31 December 2016. Second, I did not include Frasers Logistics & Industrial Trust as it does not have a full year's worth of data for the above-mentioned time period. Third, I assumed that there is a monotonic relationship among the Fraser family members. Monotonic relationships are less restrictive than linear relationships: every linear relationship is monotonic, but not every monotonic relationship is linear. Therefore, I will be using Spearman's correlation, which is suited for this task.
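To see why the monotonic assumption matters, here is a minimal sketch in base R. The data is made up purely for illustration (it is not stock data): y always rises with x, but along a curve rather than a straight line. Spearman's correlation works on ranks, so it treats the relationship as perfectly monotonic, while Pearson's correlation is pulled down by the curvature.

```r
# Toy data (made up for illustration): y rises with x, but along a curve.
x <- 1:20
y <- exp(x / 4)

# Pearson measures linear association; Spearman measures monotonic
# association because it is computed on the ranks of the values.
pearson  <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

Since the ranks of x and y line up exactly, the Spearman coefficient comes out as 1 even though the points do not fall on a straight line.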

Basically, the function which I have written takes in three arguments: (a) the list of stocks to be correlated with one another, (b) the start date, and (c) the end date.

The function consists of the following steps:
1). use quantmod to download the price data for a list of stock symbols/tickers from Yahoo Finance
2). retain only the closing prices of all the stock counters for the period between the start date and the end date (inclusive of both dates)
3). join the closing prices together in one dataset, with each column representing one counter
4). produce scatterplot matrices and the Spearman's correlation table.
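
Steps 2 to 4 can be sketched roughly as follows. This is not my actual script: it is a simplified base R version in which the function name correlate_counters, the input format (a named list of data frames with Date and Close columns), and the three fictitious counters are all assumptions made for illustration. The quantmod download in step 1 is left out so that the sketch runs without an internet connection.

```r
# Hypothetical helper: `prices` is a named list of data frames, one per
# counter, each with a Date column and a Close column. In the real script
# this data would come from the Yahoo Finance import (step 1).
correlate_counters <- function(prices, start_date, end_date) {
  start_date <- as.Date(start_date)
  end_date   <- as.Date(end_date)

  # Step 2: keep only closing prices within [start_date, end_date], inclusive.
  closes <- lapply(names(prices), function(counter) {
    df   <- prices[[counter]]
    keep <- df$Date >= start_date & df$Date <= end_date
    out  <- df[keep, c("Date", "Close")]
    names(out)[2] <- counter  # name the price column after the counter
    out
  })

  # Step 3: join on Date so that each column represents one counter.
  joined <- Reduce(function(a, b) merge(a, b, by = "Date"), closes)

  # Step 4: Spearman's correlation table (Date column dropped).
  cor(joined[, -1, drop = FALSE], method = "spearman")
}

# Made-up prices for three fictitious counters, purely for demonstration.
set.seed(1)
dates <- seq(as.Date("2015-12-28"), as.Date("2016-01-15"), by = "day")
make_counter <- function() {
  data.frame(Date = dates,
             Close = 2 + cumsum(rnorm(length(dates), sd = 0.02)))
}
toy_prices <- list(AAA = make_counter(), BBB = make_counter(), CCC = make_counter())

result <- correlate_counters(toy_prices, "2016-01-01", "2016-01-15")
print(round(result, 2))
```

The scatterplot matrix in step 4 could be drawn from the same joined dataset with base R's pairs() function; it is omitted here to keep the sketch short.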

So, here are the scatterplot matrices produced by R:

[Image: scatterplot matrices of the Fraser family counters]

At first glance, I thought that there was something wrong with the output. For example, if you look at the scatterplot in the first column from the left, second row from the top, it has F&N on its x-axis and FCL on its y-axis. A mirror image of that scatterplot can be found in the second column from the left, first row from the top. The change is that now F&N is on the y-axis and FCL is on the x-axis. The scatterplots really do look different from one another when the counters swap axes! I've checked the underlying raw data and everything seems to be correct. Guess it must be the compression of the y-axis (relative to the x-axis) that causes the distortion in presentation.
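
One way to reassure yourself that only the presentation differs is to check that the correlation does not care which counter sits on which axis. Below is a small base R sketch using made-up prices for two fictitious counters (a and b are not real data).

```r
# Made-up closing prices for two fictitious counters.
set.seed(42)
a <- 2.0 + cumsum(rnorm(100, sd = 0.02))
b <- 1.5 + 0.5 * a + rnorm(100, sd = 0.01)

# Swapping the axes changes how the scatterplot looks
# (compare plot(a, b) with plot(b, a)), but not the correlation.
rho_ab <- cor(a, b, method = "spearman")
rho_ba <- cor(b, a, method = "spearman")

print(c(rho_ab = rho_ab, rho_ba = rho_ba))
```

Both coefficients are the same, so the two mirrored panels in a scatterplot matrix carry the same information even when they look different.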

What about the Spearman's correlation table?

[Image: Spearman's correlation table of the Fraser family counters]

Over the last year, the performance of Frasers Centrepoint Trust was positively associated with the performance of Frasers Commercial Trust. Frasers Hospitality Trust was the least associated with the other counters in the Frasers family (its correlation coefficients with the other counters are generally smaller).

That's all for now.

In the meantime, I shall touch up on my programming code. I realized that I have no error handling mechanism in my code (e.g. if only one symbol/ticker is used as an input, it should throw up a warning statement instead of an error). Also, the scatterplots could be made more visually appealing (most probably with the ggplot2 package).
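
The kind of check I have in mind could look something like the sketch below. The helper name check_inputs and its messages are hypothetical and not part of my current script; it simply warns when fewer than two distinct symbols are supplied and stops when the dates are the wrong way round.

```r
# Hypothetical validation helper; the name and messages are illustrative only.
check_inputs <- function(symbols, start_date, end_date) {
  start_date <- as.Date(start_date)
  end_date   <- as.Date(end_date)
  if (start_date > end_date) {
    stop("start_date must not be later than end_date")
  }
  symbols <- unique(symbols)
  if (length(symbols) < 2) {
    # A correlation needs at least two counters: warn instead of erroring.
    warning("Need at least two distinct symbols to compute a correlation")
    return(FALSE)
  }
  TRUE
}

# Usage: catch the warning and carry on gracefully.
ok <- tryCatch(
  check_inputs("AAA", "2016-01-01", "2016-12-31"),
  warning = function(w) {
    message("Warning caught: ", conditionMessage(w))
    FALSE
  }
)
print(ok)
```

The main function could call such a helper first and return early when it reports FALSE, rather than failing halfway through the download.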

Readers, if you want to know whether a counter correlates with another counter, do drop a comment. I'm keen to test my function out further. =P

Just specify the list of counters, start date, and end date!