19 sept. 2017

Tracking Scholarly Publishing of Hospitals Using MEDLINE, Scopus, WoS and Google Scholar

Pylarinou, S., & Kapidakis, S. (2017). 
Tracking Scholarly Publishing of Hospitals Using MEDLINE, Scopus, WoS and Google Scholar. 
Journal of Hospital Librarianship, 17(3), 209-216. 
http://dx.doi.org/10.1080/15323269.2017.1332934


Scientific literature focuses on facilitating communication among researchers. Many studies have been conducted to compare effectiveness, coverage, and performance among databases available to researchers and/or librarians. In this study, the authors compared MEDLINE/PubMed, Scopus, Web of Science (WoS), and Google Scholar performance regarding searching for scholarly publishing of institutions such as hospitals. 

Queries for the scholarly publications of the personnel of specific hospitals were run, and the resulting articles were compared. The MEDLINE/PubMed database, Scopus and Web of Science offer the option to search by affiliation. Affiliations in Google Scholar can be searched by running a query as an exact phrase. Queries were phrased in a way that was suitable for each source as well as to enable comparison. To facilitate comparison, the search was limited to 2016. Data were collected at the end of August 2016.

Natural language use when authors denote affiliation affects retrieval of scholarly publishing. Effectiveness of searching scholarly publishing of a specific institution is better served when there is concurrent use of many databases.

An affiliation-based search could be better served if searchers use multiple sources in combination. In this study, a comparison of bibliographic database results favored MEDLINE/PubMed. Of the two freely available resources, MEDLINE/PubMed and Google Scholar, MEDLINE/PubMed also provided better results.

15 sept. 2017

Is Google Scholar useful for the evaluation of Chinese journals?

Zhang, Y., Lun, H., & Yang, Z. (2017)
Is Google Scholar Useful for the Evaluation of Non-English Scientific Journals? The Case of Chinese Journals
iConference 2017 Proceedings (pp. 241–261). https://doi.org/10.9776/17025


This study aims to explore how useful Google Scholar is for the evaluation of non-English journals with the case of Chinese journals. Based on a sample of 150 Chinese journals across two disciplines (Library and Information Science, Metallurgical Engineering & Technology), it provides a comparison between Google Scholar and Chongqing VIP, which is an important Chinese citation database, from three aspects: resource coverage, journal ranking and citation data. 

Results indicate that Google Scholar is equipped with sufficient resources and citation data for the evaluation of Chinese journals. However, the Chinese journal ranking reported by Google Scholar Metrics is not developed enough. Nevertheless, Google Scholar can serve as an alternative source of citation data to Chinese citation databases. The Average Citation is a useful metric in the evaluation of Chinese journals with data from Google Scholar to provide a comprehensive reflection of journals’ impact. Overall, Google Scholar is useful and worthy of attention when evaluating Chinese journals.




19 jul. 2017

Google Scholar Citations, the system that covers more publications by an author: The cases of B Cronin and WG Stock

Publication hit lists of authors, institutes, scientific disciplines etc. within scientific databases like Web of Science or Scopus are often used as a basis for scientometric analyses and evaluations of these authors, institutes etc. However, such information services do not necessarily cover all publications of an author. The purpose of this article is to introduce a re-interpreted scientometric indicator called “visibility,” which is the share of the number of an author’s publications on a certain information service relative to the author’s entire oeuvre based upon his/her probably complete personal publication list. To demonstrate how the indicator works, scientific publications (from 2001 to 2015) of the information scientists Blaise Cronin (N = 167) and Wolfgang G. Stock (N = 152) were collected and compared with their publication counts in the scientific information services ACM, ECONIS, Google Scholar, IEEE Xplore, Infodata eDepot, LISTA, Scopus, and Web of Science, as well as the social media services Mendeley and ResearchGate. For almost all information services, the visibility amounts to less than 50%. The introduced indicator represents a more realistic view of an author’s visibility in databases than the currently applied absolute number of hits in those databases.
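The visibility indicator described above is a simple share, so it can be sketched in a few lines of Python. This is only an illustration: the function name and the hit count in the usage note are hypothetical, and only the oeuvre size N = 167 comes from the abstract.

```python
def visibility(hits_in_service: int, total_publications: int) -> float:
    """Share of an author's publications that one information service covers.

    hits_in_service: publications of the author found in the service.
    total_publications: size of the author's personal publication list.
    """
    if total_publications <= 0:
        raise ValueError("total_publications must be positive")
    if hits_in_service < 0:
        raise ValueError("hits_in_service must not be negative")
    return hits_in_service / total_publications


# Hypothetical hit count; 167 is the oeuvre size reported for Blaise Cronin.
share = visibility(60, 167)
print(f"Visibility: {share:.1%}")
```

A service that returned 60 of the 167 publications would fall below the 50% mark that the abstract reports for almost all services.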


18 jul. 2017

Scientific information discovery: Still a mission of the academic library? Google & Google Scholar empire

Rodríguez-Bravo, B.; Simões, MG; Vieira-de-Freitas, MC; Frías, JA (2017)
Descubrimiento de información científica: ¿todavía misión y visión de la biblioteca académica? 
El profesional de la información, 26 (3): 464-479
https://doi.org/10.3145/epi.2017.may.13

Access to quality content is key to research and one of the core values that scholars assign to the library. Bibliographic data play a fundamental role in university libraries, which devote abundant resources to obtaining and hosting them for access.
This study investigates where and how bibliographic information is discovered, and highlights the role of search engines, databases, repositories, and web-scale discovery services in that process. The effort that libraries have made in implementing these services seems to have paid off in relation to the increase in the use of collections. However, Google remains the top option for discovering scientific information. This is a review study, based on the analysis of original research and results from recent reports.

17 jul. 2017

An evidence-based review of academic web search engines (Google Books, Google Scholar, Microsoft Academic), 2014-2016: Implications for librarians’ practice and research agenda

Academic web search engines have become central to scholarly research. While the fitness of Google Scholar for research purposes has been examined repeatedly, Microsoft Academic and Google Books have not received much attention. Recent studies have much to tell us about Google Scholar’s coverage of the sciences and its utility for evaluating researcher impact. But other aspects have been understudied, such as coverage of the arts and humanities, books, and non-Western, non-English publications. User research has also tapered off. A small number of articles hint at the opportunity for librarians to become expert advisors concerning scholarly communication made possible or enhanced by these platforms. This article seeks to summarize research concerning Google Scholar, Google Books, and Microsoft Academic from the past three years with a mind to informing practice and setting a research agenda. Selected literature from earlier time periods is included to illuminate key findings and to help shape the proposed research agenda, especially in understudied areas.

14 jul. 2017

Suitability of Google Scholar as a source of scientific information and as a source of data for scientific evaluation. Review of the Literature


Studies comparing GS to controlled databases such as Scopus, Web of Science (WOS) and others have been published almost since GS's inception. These studies focus on its coverage, quality and ability to replace controlled databases as a source of reliable scientific literature. In addition, GS's introduction of citation tracking and journal metrics has spurred a body of literature focusing on its ability to produce reliable metrics. In this article we aimed to review some studies in these areas in an effort to provide insights into GS's ability to replace controlled databases in various subject areas. We reviewed 91 comparative articles from 2005 until 2016 which compared GS to various databases, especially Web of Science (WOS) and Scopus, in an effort to determine whether GS can be used as a suitable source of scientific information and as a source of data for scientific evaluation. Our results show that GS has significantly expanded its coverage through the years, which makes it a powerful database of scholarly literature. However, the quality of the resources indexed and the overall indexing policy remain unknown. Caution should be exercised when relying on GS for citations and metrics, mainly because it can be easily manipulated and its indexing quality still remains a challenge.



Highlights
• Google Scholar is constantly expanding and includes publishers' content as well as content not available in controlled databases.
• Google Scholar provides citation counts that are broader than those covered by controlled databases.
• Google Scholar should be used with controlled databases especially when clinical information retrieval is required.
• Google Scholar is challenging when advanced searching is required.
• Google Scholar does not support data downloads and therefore is difficult to use as a sole bibliometric source.
• Google Scholar lacks quality control and clear indexing guidelines.

4 jul. 2017

Faculty Use of Author Identifiers and Researcher Networking Tools

This cross-sectional survey focused on faculty use and knowledge of author identifiers and researcher networking systems, and professional use of social media, at a large state university. Results from 296 completed faculty surveys representing all disciplines (9.3% response rate) show low levels of awareness and variable resource preferences. The most utilized author identifier was ORCID while ResearchGate, LinkedIn, and Google Scholar were the top profiling systems. Faculty also reported some professional use of social media platforms. The survey data will be utilized to improve library services and develop intra-institutional collaborations in scholarly communication, research networking, and research impact.


28 jun. 2017

Classic papers: déjà vu, a step further in the bibliometric exploitation of Google Scholar



After giving a brief overview of Eugene Garfield’s contributions to the issue of identifying and studying the most cited scientific articles, manifested in the creation of his Citation Classics, the main characteristics and features of Google Scholar’s new service -Classic Papers-, as well as its main strengths and weaknesses, are addressed. This product currently displays the most cited English-language original research articles by fields and published in 2006.
What does Google Scholar’s Classic Papers offer?



The top 10 most cited English-language original research articles published in 2006 in each of 252 subject categories, according to the data available in Google Scholar as of May 2017. The total number of articles displayed in the product is 2,515.

In order to make it to this product, articles must meet the following criteria:

˗ They must have been published in 2006
˗ They must be journal articles, articles deposited in repositories, or conference communications.
˗ The documents must describe original research. Review articles, introductory articles, editorials, guides, commentaries, etc. are explicitly excluded.
˗ They must be written in English.
˗ They must be among the top 10 most cited documents in their respective subject category.
˗ They must have received at least 20 citations.
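The inclusion criteria listed above can be read as a filter followed by a per-category ranking. The sketch below is one possible reading, not Google's actual procedure, which is not publicly documented. The `Article` fields, the document-type labels, and the choice to filter before ranking are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical labels for the three admitted document types.
ELIGIBLE_TYPES = {"journal article", "repository article", "conference communication"}


@dataclass(frozen=True)
class Article:
    title: str
    year: int
    doc_type: str
    original_research: bool
    language: str
    category: str
    citations: int


def is_eligible(article: Article, year: int = 2006, min_citations: int = 20) -> bool:
    """Apply the year, type, originality, language and citation-floor criteria."""
    return (
        article.year == year
        and article.doc_type in ELIGIBLE_TYPES
        and article.original_research
        and article.language == "en"
        and article.citations >= min_citations
    )


def classic_papers(articles, top_n: int = 10):
    """Return the top_n most cited eligible articles per subject category."""
    by_category = defaultdict(list)
    for article in articles:
        if is_eligible(article):
            by_category[article.category].append(article)
    return {
        category: sorted(items, key=lambda a: a.citations, reverse=True)[:top_n]
        for category, items in by_category.items()
    }
```

Filtering first and ranking second is a design assumption: ranking all documents first and filtering afterwards could leave fewer than ten articles in a category.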
As might be expected, this product has the identifying traits of most of Google’s products:

- Simple and straightforward: a list of the most cited articles in each discipline, with a simple browsing interface.
- Easy to use and understand: organized by broad scientific areas and inside of them by subject categories. Three clicks are enough to reach the documents or the public Google Scholar Citations profiles of their authors.
- Minimal information: As a whole, the product displays just over 2500 highly cited articles. Each article presents the most basic bibliographic information.
- Little methodological transparency: It is common for Google Scholar not to declare in detail how their products are developed.

Regarding the last point, there are four critical aspects about which more precise information is needed. They are aspects that could compromise the reliability and validity of the product:
The first of them is related to what Google understands as a research article.
The second aspect has to do with the subject classification of the articles.
The third aspect has to do with another crucial issue related to the way Google Scholar works: can we be sure they have successfully merged together all the versions indexed in Google Scholar of these documents?
The fourth aspect has to do with the threshold selected to consider an article a “classic paper”.

20 jun. 2017

A Novel Improvement to Google Scholar Ranking Algorithms Through Broad Topic Search

Google Scholar uses ranking algorithms to find the most relevant academic research possible. However, its algorithms use an exact keyword match and citation count to sort its results. This paper presents a novel improvement to Google Scholar algorithms by aggregating multiple synonymous searches into one set of results, offsetting the necessity to guess all potential search phrases for a research topic. This design science research method uses a broad topic analysis that examines search queries, finds synonymous phrases, and combines all keyword searches into one set of results based on current Google Scholar citation count algorithms. To support and evaluate this research-in-progress, several users will compare multiple niche search queries against old and new algorithms. The expectation of this design is to introduce modern algorithm techniques to academic search engines, resulting in greater quality, discoverability, and core topic diversity of published research.
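The aggregation idea in this abstract, combining the results of several synonymous searches into one list ordered by citation count, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name and the representation of a result as a `(document_id, citations)` pair are assumptions, and the search step itself is left out.

```python
def merge_synonymous_results(result_lists):
    """Merge result lists from synonymous queries into one ranked list.

    result_lists: iterable of lists of (document_id, citations) pairs,
    one list per synonymous search phrase.
    Returns unique documents sorted by citations, highest first.
    """
    merged = {}
    for results in result_lists:
        for doc_id, citations in results:
            # Keep each document once; retain the largest count seen for it.
            merged[doc_id] = max(citations, merged.get(doc_id, 0))
    # Sort by descending citations, breaking ties by document id.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results for two synonymous phrases.
phrase_a = [("doc1", 120), ("doc2", 45)]
phrase_b = [("doc2", 45), ("doc3", 80)]
print(merge_synonymous_results([phrase_a, phrase_b]))
```

Deduplicating by document identifier matters here, because the same paper is likely to be returned by more than one of the synonymous phrases.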



17 apr. 2017

On Low Overlap Among Search Results of Academic Search Engines: Google Scholar, Semantic Scholar, Microsoft Academic, Scopus

Mitra, A., & Awekar, A. (2017)
On Low Overlap Among Search Results of Academic Search Engines. 



Introduction: To tackle information overload, researchers increasingly depend on niche academic search engines. Recent works have shown that two major general web search engines, Google and Bing, have a high level of agreement in their top search results. Various works have tried to predict the coverage of academic search engines. However, to the best of our knowledge, there is no existing work that systematically studies overlap in the search results of academic search engines.
Methods: We collected over 2300 query terms from the 2012 ACM Computing Classification System. We sent these 2500 queries to all four selected academic search engines. For each academic search engine, we looked at the top eight results, as each academic search engine returns at least eight results on the first page. We computed the similarity between any two result sets using the Jaccard similarity.

Results: For all 2500 queries, the intersection set of all four academic search engines considered together was always empty. In other words, for each query, no research article appears in the top results list of all four academic search engines. This shows strong disagreement among academic search engines.
Therefore, we observe that the overlap in the search result sets of any pair of academic search engines is significantly low, and in most cases the search result sets are mutually exclusive.
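The two measurements described in this summary, the pairwise Jaccard similarity and the intersection across all engines, can be sketched as below. The engine names and document identifiers are hypothetical placeholders, and how the original study matched records across engines is not covered here.

```python
from itertools import combinations


def jaccard(a, b) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two result sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(results_by_engine):
    """Jaccard similarity for every pair of engines, keyed by engine names."""
    return {
        (x, y): jaccard(results_by_engine[x], results_by_engine[y])
        for x, y in combinations(sorted(results_by_engine), 2)
    }


def common_to_all(results_by_engine):
    """Documents that appear in the top results of every engine."""
    sets = [set(results) for results in results_by_engine.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical top results for one query.
top_results = {
    "engine_a": ["p1", "p2", "p3"],
    "engine_b": ["p3", "p4", "p5"],
    "engine_c": ["p6", "p7", "p8"],
}
print(pairwise_jaccard(top_results))
print(common_to_all(top_results))
```

With top-eight lists, two engines that share a single article have a Jaccard similarity of 1/15, which shows how quickly the measure drops when the overlap is small.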


6 mar. 2017

Use of Google Scholar public profiles in orthopedics: Rate of growth and changing international patterns

Tetsworth, K., Fraser, D., Glatt, V., & Hohmann, E. 
Use of Google Scholar public profiles in orthopedics: 
Rate of growth and changing international patterns. 
Journal of Orthopaedic Surgery, 2017, 25(1), 
doi.org/10.1177/2309499017690322


Introduction: The purpose of this study was to survey the growth of Google Scholar public profiles in orthopedics over a 12-month period and to investigate global patterns. Methods: Data was prospectively acquired from June 2013 to June 2014. Google Scholar queries specific to orthopedic surgery were performed at 90-day intervals. Demographic aspects of each user were also compiled, including gender, current location, and primary interests. To determine differences between the growth of Google Scholar public profile registrations and citation counts, as well as differences in growth in different regions, repeated measures analysis of variance (RMANOVA) was used.
Results: RMANOVA revealed statistically significant differences (p = 0.0001) for regional growth. The largest growth was observed in the United Kingdom (p = 0.009, 289%), followed by the Asia-Pacific region (p = 0.004, 177%) and “Other” (p = 0.006, 172%). The mean growth per 90-day interval is 19.9% (p = 0.003) and the mean 12-month growth is 107% (p = 0.05). Statistically significant differences between gender (male vs. female) and basic and clinical sciences (χ² = 22.4, p = 0.0001) were observed. Conclusion: This study suggests an exponential growth in the number of authors in the field of orthopedic surgery creating a Google Scholar public profile, and at the current rate participation doubles every 10.6 months.

9 jan. 2017

A two-sided academic landscape: snapshot of highly-cited documents in Google Scholar (1950-2013)

Alberto Martin-Martin, Enrique Orduna-Malea, Juan M. Ayllón, Emilio Delgado López-Cózar
 A two-sided academic landscape: snapshot of highly-cited documents in Google Scholar (1950-2013) 
Revista Española de Documentación Científica, 39(4): e149
DOI 10.3989/redc.2016.4.1405

OBJECTIVES
The main objective of this paper is to identify the set of highly-cited documents in Google Scholar and define their core characteristics, in order to give an answer to the following research questions:
• Which are the most cited documents in Google Scholar?
• Which is the most frequent document type for these highly-cited documents?
• In what languages are the most cited documents written?
• How many highly-cited documents are freely accessible?
• What are the most common file formats to store these highly cited documents?
• Which are the main providers of these highly-cited full text documents?
METHODOLOGY
Sample
64,000 documents published between 1950 and 2013 (1,000 per year)
Design
A longitudinal analysis was carried out by performing 64 keyword-free year queries from 1950 to 2013 (one query per year). All the records displayed (a maximum of 1,000 per query) were extracted, obtaining a final set of 64,000 records.
This process was carried out twice (on the 28th of May, and on the 2nd of June, 2014).
Period analyzed
1950-2013
RESULTS
Which are the most cited documents in Google Scholar?
The most cited document according to GS is the article by Lowry et al., with 253,671 citations (as of May 2014), followed by Laemmli’s article, with 221,680 citations.
Although the ranking is dominated by studies from the natural sciences (especially the life sciences), it also contains many works from the social sciences (especially from economics, psychology and sociology), and also from the humanities (philosophy and history).
Many of the works in this ranking are methodological in nature: they describe the steps of a certain procedure or how to handle basic tools to process and analyse data. This is exemplified by the presence of manuals (statistical, laboratory, research methodology), and works that have become a de facto standard in professional practice.
In fact, books are the most common category among the top 1% most cited documents, constituting 62% (395) of this subsample, followed by journal articles with 36.01% (231). Moreover, the citation average of books (2,700) is higher than that for journal articles (1,700).
Which is the most frequent document type for these highly-cited documents?
- The document type has been identified in 71% (45,440) of the documents sampled, whereas the typology of the other 29% (18,590) remained unknown.
- Journal articles (including reviews, letters and notes as well) predominate: they represent 51% of the total 64,000 documents (72.3% of the documents with a defined document type). Books and book chapters together also make up a big part of the sample (18%; 11,240 items), while the presence of conference proceedings and other typologies (meeting abstracts, corrections, editorial material, etc.) is only marginal (1% each). (Fig. 1)
In what languages are the most cited documents written?
English dominates over the rest of the languages as the most widely used language for scientific communication in Google Scholar, accounting for 92.5% of all the documents. The second and third places are occupied by Spanish and Portuguese respectively but neither of them reaches even 2% of the total (Fig. 2)
How many highly-cited documents are freely accessible?
A free full-text link is provided for 40% (25,849) of all the highly-cited documents retrieved (Figure 3; top). We can also observe a positive trend through the analyzed period (from 25.93% of documents with free full-text links in the period 1950-1959, to 66.84% in 2000-2009).
What are the most common file formats to store these highly cited documents?
The most common one is the pdf format (86.0% of all full text documents), followed by the html format (12.1%). The remaining identified file formats (doc, ps, txt, rtf, xls, ppt) together only represent 1.9% of the freely available documents. The predominance of the pdf format is evident throughout the entire range of years (Fig. 4).
Which are the main providers of these highly cited full text documents?
A total of 5,715 different providers of free full-text links to highly cited documents have been found in the sample. However, a group of 35 providers (18 universities; 5 scientific societies; 4 publishers; 2 companies; 2 public administrations; 1 journal; 1 digital library; 1 repository; 1 academic social network) account for more than a third of all the links (37%). If we analyse the top-level domains of the 25,849 links to available full-text documents, the most frequent are academic institutions (.edu; 23.74%) and organizations (.org; 21.39%).
Versions


83.17% (53,229) of the documents analyzed have more than one version (Table IV).

CONCLUSIONS

In light of the results obtained, we can conclude that Google Scholar offers an original and different vision of the most influential documents in the academic/scientific environment (measured from the perspective of their citation count). These results are a faithful reflection of the all-encompassing indexing policies that enable Google Scholar to retrieve a larger and more diverse number of citations, since they come from a wider range of document types, different geographical environments, and languages.
Therefore, Google Scholar covers not only seminal research works in the entire spectrum of the scientific fields, but also the greatly influential works that scientists, teachers and professionals who are training to become practitioners use in their respective fields. This phenomenon is particularly true for works that deal with new data collecting and processing techniques.


What this study adds

Thanks to the wide and diverse list of sources from which Google Scholar feeds, this search engine covers academic documents in a broader sense, enabling the measurement of impact stemming not only from the scientific side of the academic landscape, but also from the educational side (doctoral dissertations, handbooks) and from the professional side (working papers, technical reports, patents), the last two being areas that haven’t been explored as much as the first one.

4 jan. 2017

Can we use Google Scholar to identify highly-cited documents?

Alberto Martin-Martin, Enrique Orduna-Malea, Anne-Wil Harzing, Emilio Delgado López-Cózar
  Can we use Google Scholar to identify highly-cited documents? 
Journal of Informetrics, 2017, 11(1), 152-163
DOI 10.1016/j.joi.2016.11.008

OBJECTIVES
This paper has two main objectives:
1. Verify whether it is possible to reliably identify the most highly-cited papers in Google Scholar, and indirectly
2. Empirically validate whether citations are the primary result-ordering criterion in Google Scholar for generic queries or whether other factors substantially influence the rank order
METHODOLOGY
Sample
64,000 documents published between 1950 and 2013 (1,000 per year)
Design
A generic query was run as a null query (the search box is left blank), filtering only by publication year using Google Scholar’s advanced search function. In this way, we avoided the sampling bias caused by the keywords of a specific query and by other academic search engine optimisation issues. In order to work with a sufficiently large data sample, a longitudinal analysis was carried out by performing 64 generic null queries from 1950 to 2013 (one query per year). Whereas 2013 was the last complete available year when our data collection was carried out, 1950 was selected because this particular year reflected an increase in coverage in comparison to the preceding years.
Period analyzed
1950-2013
RESULTS
The overall correlation between the number of citations received by the 64,000 documents and the position they occupied on the results page of Google Scholar at the time of the query is r = −0.67 (p < 0.05). The average annual value of the correlation coefficient is very high (negative values for the correlation are due to position 1 being better than position 1000). Fig. 1
- The correlation for the results placed amongst the top 900 positions is r = 0.97 (p < 0.01). However, the correlation obtained for results in the last 100 positions is only r = 0.61 (p < 0.01). The results located in the first 900 positions of each search are displayed in green, while the results in the last 100 positions are shown in red (Fig. 2). In this way we can see clearly how, until approximately the 900th position, the Google Scholar sorting criteria are based largely on the number of citations received by each result. However, after approximately the 900th position, the data show erratic results in terms of the correlation between citations and position (Fig. 2).
- The correlation between the position of a document and the number of versions is low, but significant (r = −0.30; p < 0.01). The average correlation per year is slightly higher (r = −0.33; p = 0.04). Fig. 6 shows that, despite the wide dispersion of data, there is a slight concentration of documents with between 100 and 300 versions amongst the first 100 rank positions (Fig. 3).
- The annual average percentage of documents in English for results within the first 100 positions is 99.5%. Therefore, the presence of documents in other languages within this range is unusual. When analysing this same percentage for the documents in the last 100 positions, the results change significantly. The annual average drops to 34.2%. (Fig. 4)
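The relationship between rank position and citation count reported above can be illustrated with a correlation coefficient. The summary does not state which coefficient the authors computed, so the sketch below assumes Pearson's r, and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical result page: position 1 holds the most cited document.
positions = [1, 2, 3, 4, 5]
citations = [900, 850, 400, 390, 120]
print(pearson_r(positions, citations))
```

When highly cited documents sit at the top of the list, the coefficient is negative, which matches the note that negative values arise because position 1 is better than position 1000.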



CONCLUSIONS

Significant and high correlation between the number of citations and the ranking of the documents retrieved by Google Scholar was obtained for a generic query filtered only by year. The fact that we minimised the effects of academic search engine optimisation, together with the size of the sample analysed (64,000 documents), leads us to conclude that the number of citations is a key factor in the ranking of the results and, therefore, that Google Scholar is able to identify highly-cited papers effectively. Given the unique coverage of Google Scholar (no restrictions on document type and source), this makes it an invaluable tool for bibliometric analysis.



What this study adds

Google Scholar can be used to reliably identify the most highly-cited academic documents. Given its wide and varied coverage, Google Scholar has become a useful complementary tool for bibliometric research concerned with the identification of the most influential scientific works.