Today, we want to give you a bit of insight into our day-to-day work. In our department, we explore gender issues in asset management – check out our essential readings on the topic. The first step is, obviously, to correctly identify a fund manager’s gender.
Finding the right algorithm
Seems easy? In fact, it’s a rather involved algorithm you have to apply – because no publicly available database records a fund manager’s gender! For a study we are currently working on, we did the following:
- We downloaded annual summary statistics for diversified domestic U.S. equity funds from 1992 to 2015 from the CRSP database. This database is available at the University of Hohenheim via WRDS, courtesy of the DALAHO, and contains the names of the responsible fund managers, as well as a fund identifier.
- We dropped all summary terms such as “Team managed” or “Team”, all lists of last names (e.g., Smith/Myers/Jones), and all names where we only have an initial (e.g., J. Smith), so that we end up with a focus on single-managed funds. This leaves us with a list of 11,114 (full) first/last name combinations.
- We then downloaded lists mapping first names to gender from the US Social Security Administration (SSA). The lists tell you whether a first name is (predominantly) male, (predominantly) female, or both, and have been used in studies such as Niessen-Ruenzi/Ruenzi (2017) or Adebambo/Yan (2016).
- The lists start at 1879. We pooled all names into one list and dropped duplicate entries that show up in more than one list.
- Last, we matched the first names of our fund managers to the first names in the SSA list to obtain a gender assignment for each of the 11,114 first/last name combinations.
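The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the study: the function names, the `(name, sex, count)` row layout, and the rule that a name counts as “both” as soon as it appears with both sexes anywhere in the pooled lists are all assumptions made for this example.

```python
# Hypothetical summary terms that do not name an individual manager.
TEAM_TERMS = {"team managed", "team"}


def is_full_single_name(raw):
    """Return True if the entry looks like one manager with a full first name."""
    name = raw.strip()
    if name.lower() in TEAM_TERMS:
        return False          # summary term such as "Team managed"
    if "/" in name:
        return False          # list of last names such as "Smith/Myers/Jones"
    parts = name.split()
    if len(parts) < 2:
        return False          # no first/last name combination
    if len(parts[0].rstrip(".")) <= 1:
        return False          # initial only, such as "J. Smith"
    return True


def build_gender_lookup(ssa_rows):
    """Pool (name, sex, count) rows from all yearly lists into one lookup.

    Pooling into a dict removes duplicate names automatically.
    Assumed rule: a name seen with both sexes is labelled "both".
    """
    sexes_by_name = {}
    for name, sex, _count in ssa_rows:
        sexes_by_name.setdefault(name.lower(), set()).add(sex)
    lookup = {}
    for name, sexes in sexes_by_name.items():
        if sexes == {"M"}:
            lookup[name] = "male"
        elif sexes == {"F"}:
            lookup[name] = "female"
        else:
            lookup[name] = "both"
    return lookup


def assign_gender(manager_names, lookup):
    """Match each retained manager's first name against the lookup."""
    result = {}
    for raw in manager_names:
        if not is_full_single_name(raw):
            continue
        first = raw.split()[0].lower()
        result[raw] = lookup.get(first, "unmatched")
    return result


# Toy data, invented for illustration only.
managers = ["Team managed", "Smith/Myers/Jones", "J. Smith",
            "John Smith", "Mary Jones", "Zed Brown"]
ssa_rows = [("John", "M", 100), ("John", "F", 2),
            ("Mary", "F", 90), ("Mary", "F", 80)]

assignments = assign_gender(managers, build_gender_lookup(ssa_rows))
print(assignments)
```

Note that under this assumed rule, a single record of a name with the other sex is enough to make the name ambiguous.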
Ambiguous gender assignments, …
So, what does the sample of matched first names look like? Check out Figure 1!
Out of 11,114 names, only 263 are classified as unambiguously male or female (2.4%). In contrast, 10,850 of the names could be either male or female (97.6%). So, using the SSA list would force us to drop 98% of our sample!
… and almost no women in the sample!
Even accepting that we can only identify gender for such a small proportion, how many managers are male and how many are female? Check out Figure 2 to see this!
Figure 2 shows that 23 out of the 263 managers are female according to the SSA list. Clearly, any inference relying on 23 (out of initially 11,114) individuals is extremely dubious. Assume, for the moment, that the 9%/91% female/male ratio we find in the smaller group also holds for the entire sample. This would imply roughly 1,000 female managers – and we would draw inferences on their behavior from a group of 23!
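For readers who want to retrace the back-of-the-envelope numbers, here is the arithmetic as a short Python snippet. It uses only the counts reported in this post; the variable names are ours.

```python
total_names = 11_114      # full first/last name combinations
identified = 263          # names classified as male or female
female = 23               # of those, classified as female

female_share = female / identified            # share of women among identified names
implied_exact = female_share * total_names    # extrapolation with the unrounded share
implied_rounded = 0.09 * total_names          # extrapolation with the rounded 9% share

print(f"female share among identified names: {female_share:.1%}")
print(f"implied female managers (unrounded share): {implied_exact:.0f}")
print(f"implied female managers (9% share): {implied_rounded:.0f}")
```

Either way, the extrapolation lands at roughly 1,000 women, of whom only 23 are actually identified.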
Can better algorithms help identify fund manager gender?
How can we enlarge the sample by identifying gender for more fund managers? We could use other name lists, but this gave similar results. We could develop our own assignment algorithm, aka “manual determination”. But this is tricky, since a name can be assigned to different genders in different cultures: “Andrea” is a male first name in Italy and a female first name in Germany. Alternatively, we can compute the probability that a name refers to a male or a female manager based on census data. This could be done using the total frequencies given by the SSA list, which we will report on in an upcoming post!
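To give a flavor of the frequency-based idea, here is a minimal Python sketch. It is one possible way to do it, not the procedure from our upcoming post: the `(name, sex, count)` row layout, the function names, and the 95% cutoff are assumptions chosen for illustration, and the counts are made up.

```python
from collections import defaultdict


def female_share_by_name(ssa_rows):
    """Sum counts per name and sex, then return the female share per name."""
    totals = defaultdict(lambda: {"M": 0, "F": 0})
    for name, sex, count in ssa_rows:
        totals[name.lower()][sex] += count
    return {
        name: c["F"] / (c["F"] + c["M"])
        for name, c in totals.items()
        if c["F"] + c["M"] > 0
    }


def classify(first_name, shares, threshold=0.95):
    """Assign a gender only if the name is at least `threshold` one-sided.

    The threshold is a hypothetical choice, not a value from the post.
    """
    p_female = shares.get(first_name.lower())
    if p_female is None:
        return "unmatched"
    if p_female >= threshold:
        return "female"
    if p_female <= 1 - threshold:
        return "male"
    return "ambiguous"


# Made-up counts, for illustration only.
rows = [("John", "M", 980), ("John", "F", 20),
        ("Mary", "F", 990), ("Mary", "M", 10),
        ("Jordan", "M", 600), ("Jordan", "F", 400)]

shares = female_share_by_name(rows)
for name in ["John", "Mary", "Jordan", "Zed"]:
    print(name, classify(name, shares))
```

With a rule like this, a name that is overwhelmingly used for one gender is no longer thrown into the “both” bucket because of a handful of records for the other gender. A cultural case such as “Andrea” would still need separate treatment.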