20 Dec 2016

H-index manipulation by merging articles in Google Scholar Profiles: Models, theory, and experiments

R van Bevern, C Komusiewicz, R Niedermeier, M Sorge, T Walsh
H-index manipulation by merging articles: Models, theory, and experiments
Artificial Intelligence 2016, 240: 9–35

The H-index is a widely used measure for estimating the productivity and impact of researchers, journals, and institutions. Several publicly accessible databases such as AMiner, Google Scholar, Scopus, and Web of Science compute the H-index of researchers. Such metrics are therefore visible to hiring committees and funding agencies when comparing researchers and proposals. 
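As a refresher, the H-index of a profile is the largest h such that at least h articles have at least h citations each. A minimal sketch in Python (citation counts are made up for illustration):

```python
def h_index(citations):
    """Largest h such that at least h articles have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# Five articles with these citation counts: four have >= 4 citations,
# but there are no five articles with >= 5 citations, so h = 4.
print(h_index([10, 8, 5, 4, 3]))  # -> 4
```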

Although the H-index of Google Scholar profiles is computed automatically, profile owners can still affect their H-index by merging articles in their profile. The intention behind providing the option to merge articles is to enable researchers to identify different versions of the same article. A merge may decrease a researcher’s H-index if both articles counted towards it before merging, or increase it, since the merged article may have more citations than each of the individual articles. Since the Google Scholar interface permits merging arbitrary pairs of articles, the H-index of Google Scholar profiles is vulnerable to manipulation by insincere authors.
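Both effects can be seen in a small hypothetical example. Here the merged article's citation count is taken to be the sum of the individual counts, which is one plausible measure; the paper compares several, and Google Scholar's actual behavior may differ:

```python
def h_index(citations):
    """Largest h such that at least h articles have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
    return h

# Hypothetical profile: H-index is 3 (three articles with >= 3 citations each).
profile = [5, 5, 4, 2, 2]
print(h_index(profile))        # -> 3

# Merging the two 2-citation articles (summing their counts) turns the
# profile into [5, 5, 4, 4]: now four articles have >= 4 citations.
merged = [5, 5, 4, 2 + 2]
print(h_index(merged))         # -> 4

# Conversely, merging two articles that both counted towards the H-index
# reduces the number of articles and can lower it:
print(h_index([4, 4, 4, 4]))   # -> 4
print(h_index([4 + 4, 4, 4]))  # -> 3
```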


1. We propose two further ways of measuring the number of citations of a merged article. One of them seems to be the measure used by Google Scholar.

2. We propose a model for restricting the set of allowed merge operations. Although Google Scholar allows merges between arbitrary articles, such a restriction is well motivated: an insincere author may try to merge only similar articles in order to conceal the manipulation.

3. We consider the variant of H-index manipulation in which only a limited number of merges may be applied in order to achieve a desired H-index. This is again motivated by the fact that an insincere author may try to conceal the manipulation by performing only a few changes to her or his own profile.

4. We analyze each problem variant presented here within the framework of parameterized computational complexity. That is, we identify parameters p (properties of the input measured in integers) and aim to design fixed-parameter algorithms, which have running time f(p) · n^O(1) for a computable function f independent of the input size n. In some cases, this allows us to give efficient algorithms for realistic problem instances despite the NP-hardness of the problems in general. We also identify parameters that presumably cannot lead to fixed-parameter algorithms by showing some problem variants to be W[1]-hard for these parameters.

5. We evaluate our theoretical findings by performing experiments with real-world data based on the publication profiles of AI researchers. In particular, we use profiles of some young and up-and-coming researchers from the 2011 and 2013 editions of the IEEE “AI’s 10 to watch” list.
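The difference between citation measures for a merged article (point 1 above) can be illustrated with citing sets. The data and the informal measure names below are invented for illustration and are not the paper's exact definitions:

```python
# Citing articles for two versions of the same paper (made-up data).
citing_a = {"paper1", "paper2", "paper3"}
citing_b = {"paper2", "paper3", "paper4"}

# Summing the individual counts double-counts the overlapping citers.
sum_cite = len(citing_a) + len(citing_b)  # 3 + 3 = 6

# Counting distinct citing articles avoids double counting.
union_cite = len(citing_a | citing_b)     # |{paper1..paper4}| = 4

print(sum_cite, union_cite)  # -> 6 4
```

Which of these (or some other measure) a database applies determines how much a merge can inflate the citation count of the merged article.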

