
Scientific and Technical Libraries


Development of an algorithm for automating retroconversion for creating an electronic catalog

https://doi.org/10.33186/1027-3689-2025-2-144-161

Abstract

The authors substantiate the need for electronic catalogs, which significantly simplify users’ access to relevant information, and outline the difficulties of creating them. These difficulties are especially acute for libraries with a long history and large collections that are beginning the transition to digital formats. The authors discuss how scanning paper catalog cards can broaden bibliographic search, and describe the ways of converting paper cards into digital form.

As part of the study, the advantages and disadvantages of each method of building an electronic catalog were analyzed, and various technical tools were reviewed to find the most efficient solution. Based on this analysis, an algorithm was implemented in Python using additionally trained neural networks. It performs image preprocessing, localizes the required areas of a card, recognizes the text and, most importantly, converts the recognized text into RUSMARC fields and subfields. The algorithm speeds up the retroconversion of bibliographic data compared with manual entry.
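The final step of such a pipeline, turning recognized text into RUSMARC fields and subfields, can be illustrated with a minimal Python sketch. The abstract does not show the authors' implementation, so everything here is an assumption: the region labels (`author`, `title`, and so on), the `REGION_TO_SUBFIELD` table, both function names and the sample card are hypothetical. Only the field tags and subfield codes (200 $a and $f, 210 $a, $c and $d, 215 $a, 700 $a) follow the UNIMARC conventions on which RUSMARC is based.

```python
# Hypothetical mapping from a localized card region to (field tag, subfield code).
# The labels are illustrative; the tags and codes follow UNIMARC/RUSMARC usage.
REGION_TO_SUBFIELD = {
    "author": ("700", "a"),          # personal name, entry element
    "title": ("200", "a"),           # title proper
    "responsibility": ("200", "f"),  # first statement of responsibility
    "place": ("210", "a"),           # place of publication
    "publisher": ("210", "c"),       # name of publisher
    "year": ("210", "d"),            # date of publication
    "extent": ("215", "a"),          # extent, e.g. number of pages
}


def regions_to_fields(regions: dict[str, str]) -> dict[str, list[tuple[str, str]]]:
    """Group recognized region texts into fields keyed by tag."""
    fields: dict[str, list[tuple[str, str]]] = {}
    for label, text in regions.items():
        if label not in REGION_TO_SUBFIELD:
            continue  # unknown regions (stamps, shelf marks) are skipped here
        cleaned = " ".join(text.split())  # collapse OCR line breaks and extra spaces
        if not cleaned:
            continue
        tag, code = REGION_TO_SUBFIELD[label]
        fields.setdefault(tag, []).append((code, cleaned))
    for subfields in fields.values():
        subfields.sort(key=lambda pair: pair[0])  # stable subfield order within a field
    return dict(sorted(fields.items()))  # fields in ascending tag order


def format_record(fields: dict[str, list[tuple[str, str]]]) -> str:
    """Render fields as readable lines such as '210 $aMoscow$d1869'."""
    return "\n".join(
        tag + " " + "".join(f"${code}{value}" for code, value in subfields)
        for tag, subfields in fields.items()
    )


# Made-up recognition result for a single card, for illustration only.
sample_card = {"year": "1869", "title": "  War and\nPeace ", "author": "Tolstoy", "place": "Moscow"}
print(format_record(regions_to_fields(sample_card)))
```

A table-driven mapping keeps the conversion rules separate from the recognition code, so new region types can be added without touching the logic. A production converter would also need indicators, repeated fields and record-level data, which this sketch leaves out.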

About the Authors

V. A. Korobkovsky
National Research University for Information Technologies, Mechanics and Optics
Russian Federation

Vadim A. Korobkovsky – Student of Master Level

St. Petersburg



N. N. Gorlushkina
National Research University for Information Technologies, Mechanics and Optics
Russian Federation

Natalia N. Gorlushkina – Cand. Sc. (Engineering), Associate Professor

St. Petersburg



M. A. Belinskaya
Library of Russian Academy of Sciences
Russian Federation

Maria A. Belinskaya – Head, Department for Information Technologies and Automation

St. Petersburg





For citations:


Korobkovsky V.A., Gorlushkina N.N., Belinskaya M.A. Development of an algorithm for automating retroconversion for creating an electronic catalog. Scientific and Technical Libraries. 2025;(2):144-161. (In Russ.) https://doi.org/10.33186/1027-3689-2025-2-144-161



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1027-3689 (Print)
ISSN 2686-8601 (Online)