# Which observations should I use for my panel data analysis?

Tue 22 Aug, 2023 04:16 am
I am currently working on my Master's thesis, in which I would like to examine the impact of the geographical location of mutual funds on their performance. I have annual panel data on mutual funds, with their performance and their respective locations.

The data runs from 1963 to 2022, but it is not complete. For some funds I only have one or two years of data, for others I have around 10 years, and for practically no fund do I have data for every year.

I have three questions:

1. How should I best run a panel regression on this data? Please let me know if that is the wrong approach.
2. What is the most appropriate time period for the analysis? For example, I would have used only data from 2000 onwards, since earlier data might be too old to be relevant.
3. Which funds should I include? Should I take all of them, or only those with a certain minimum number of yearly observations, and what should that minimum be?

engineer

Tue 22 Aug, 2023 09:56 am
@flovb,
You have your primary independent variable (location), but you also have very significant secondary independent variables: overall market performance, fund objective, fund size, etc. Comparing a growth fund to a stable value fund would produce wildly different results in an up year than in a down year. That said, I can see two ways of doing this.

The first is to restrict your data to certain years. If I were doing this, I would select three years: one where the market was very good, one where it was very bad, and one where it was flat or in line with the average market performance over many years. Then I would group the funds by similar size and objective, and look at all funds that were in existence at that time, had at least X years of history behind them and met a certain size threshold. (Alternatively, you could make size another independent variable.)
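To make that filtering step concrete, here is a rough sketch in plain Python. The field names, the thresholds, the chosen year and the toy data are all made up for illustration, and "years of history" is measured here from a fund's first observation, which is only one possible definition (it ignores gaps in the series).

```python
# Hypothetical toy panel: one row per fund-year. All values are invented.
rows = [
    {"fund": "A", "year": 2005, "size_musd": 110.0, "ret": 0.06},
    {"fund": "A", "year": 2006, "size_musd": 125.0, "ret": 0.12},
    {"fund": "A", "year": 2007, "size_musd": 140.0, "ret": 0.04},
    {"fund": "A", "year": 2008, "size_musd": 150.0, "ret": -0.30},
    {"fund": "B", "year": 2008, "size_musd": 500.0, "ret": -0.41},
    {"fund": "C", "year": 2004, "size_musd": 40.0, "ret": 0.09},
    {"fund": "C", "year": 2008, "size_musd": 50.0, "ret": -0.25},
    {"fund": "D", "year": 2006, "size_musd": 180.0, "ret": 0.10},
    {"fund": "D", "year": 2008, "size_musd": 200.0, "ret": -0.35},
]

SELECTED_YEARS = {2008}   # assumption: stands in for the chosen good/bad/flat years
MIN_YEARS_HISTORY = 2     # the "X years" from the text; the value is an assumption
MIN_SIZE_MUSD = 100.0     # assumed size threshold, in millions of USD


def first_year_by_fund(rows):
    """Return the earliest observed year for each fund."""
    first = {}
    for r in rows:
        fund, year = r["fund"], r["year"]
        first[fund] = min(year, first.get(fund, year))
    return first


def select_sample(rows, years, min_history, min_size):
    """Keep fund-years in the chosen years that meet the history and size rules."""
    first = first_year_by_fund(rows)
    return [
        r for r in rows
        if r["year"] in years
        and r["year"] - first[r["fund"]] >= min_history
        and r["size_musd"] >= min_size
    ]


sample = select_sample(rows, SELECTED_YEARS, MIN_YEARS_HISTORY, MIN_SIZE_MUSD)
print([r["fund"] for r in sample])
```

In this toy data, fund B drops out because it has no history before the selected year, and fund C drops out on size. Grouping by objective would be one more filter or a grouping key on top of this.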

The second (and I think better) way would be to normalize the performance of all funds to the market. For each fund type, compare the funds to a standard market index (say the S&P 500) and then group them as above. Then you could use all years. If you do this, I would also make the market index an independent variable in your analysis.
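One simple reading of "normalize to the market" is the excess return: the fund's return minus the index return for the same year. A minimal sketch in plain Python follows. The fund names and all return figures are invented, not real index data, and a different normalization (for example a ratio, or a regression on the index) may suit your thesis better.

```python
# Hypothetical annual benchmark returns, keyed by year (invented numbers).
benchmark = {2007: 0.05, 2008: -0.37, 2009: 0.26}

# Hypothetical fund-year observations: (fund, year, annual return).
fund_returns = [
    ("A", 2007, 0.07),
    ("A", 2008, -0.30),
    ("B", 2008, -0.40),
    ("B", 2009, 0.20),
    ("B", 2010, 0.10),  # no benchmark value for 2010 in this toy data
]


def excess_returns(fund_returns, benchmark):
    """Return (fund, year, fund return minus benchmark return) per fund-year.

    Fund-years without a benchmark value are skipped rather than guessed.
    """
    out = []
    for fund, year, ret in fund_returns:
        if year in benchmark:
            out.append((fund, year, ret - benchmark[year]))
    return out


for fund, year, excess in excess_returns(fund_returns, benchmark):
    print(fund, year, round(excess, 4))
```

Because every fund-year is measured against the same year's index, funds observed in different years become comparable, which is what lets you keep the unbalanced panel instead of cutting it down to a few years.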
