Anonymizer for Polish Language
dc.contributor.author | Walkowiak, Tomasz | |
dc.contributor.author | Gniewkowski, Mateusz | |
dc.contributor.author | Pogoda, Michał | |
dc.contributor.author | Ropiak, Norbert | |
dc.date.accessioned | 2023-09-22T11:01:56Z | |
dc.date.available | 2023-09-22T11:01:56Z | |
dc.date.issued | 2023 | |
dc.description.abstract | Researchers and enterprises need to anonymize unstructured text. This is driven not only by the GDPR, but also by the increasing use of large language models (LLMs) such as GPT-3, which raise growing concerns about privacy and security risks. Texts to be processed by such models must be anonymized beforehand, and very often this has to happen at the data providers’ premises rather than at the machine learning team's site. In this paper, we present an effective anonymization pipeline for Polish. It is a modular, configurable solution that supports several modes, including pseudo-anonymization, which is challenging in languages with complex inflectional systems. Built on a microservices architecture with a REST interface, the system can be easily integrated with existing systems and deployed in different environments. | en_EN |
dc.identifier.citation | Walkowiak T., Gniewkowski M., Pogoda M., Ropiak N., Anonymizer for Polish Language. In: Progress in Polish Artificial Intelligence Research 4, Wojciechowski A. (Ed.), Lipiński P. (Ed.), Series: Monografie Politechniki Łódzkiej No. 2437, Wydawnictwo Politechniki Łódzkiej, Łódź 2023, pp. 281-284, ISBN 978-83-66741-92-8, doi: 10.34658/9788366741928.44. | |
dc.identifier.doi | 10.34658/9788366741928.44 | |
dc.identifier.isbn | 978-83-66741-92-8 | |
dc.identifier.uri | http://hdl.handle.net/11652/4820 | |
dc.identifier.uri | https://doi.org/10.34658/9788366741928.44 | |
dc.language.iso | en | en_EN |
dc.page.number | pp. 281-284 | |
dc.publisher | Wydawnictwo Politechniki Łódzkiej | pl_PL |
dc.publisher | Lodz University of Technology Press | en_EN |
dc.relation.ispartof | Wojciechowski A. (Ed.), Lipiński P. (Ed.), Progress in Polish Artificial Intelligence Research 4, Series: Monografie Politechniki Łódzkiej No. 2437, Wydawnictwo Politechniki Łódzkiej, Łódź 2023, ISBN 978-83-66741-92-8, doi: 10.34658/9788366741928. | |
dc.rights | Dla wszystkich w zakresie dozwolonego użytku | pl_PL |
dc.rights | Fair use condition | en_EN |
dc.rights.license | Licencja PŁ | pl_PL |
dc.rights.license | LUT License | en_EN |
dc.subject | natural language processing | en_EN |
dc.subject | anonymization | en_EN |
dc.subject | Polish language | en_EN |
dc.subject | Kubernetes | en_EN |
dc.subject | przetwarzanie języka naturalnego | pl_PL |
dc.subject | anonimizacja | pl_PL |
dc.subject | język polski | pl_PL |
dc.subject | Kubernetes | pl_PL |
dc.title | Anonymizer for Polish Language | en_EN |
dc.type | Rozdział - monografia | pl_PL |
dc.type | Chapter - monograph | en_EN |
Files
Original files
- Name:
- 44. Anonymizer_polish_language_Walkowiak_Gniewkowski_2023.pdf
- Size:
- 216.23 KB
- Format:
- Adobe Portable Document Format
License
- Name:
- license.txt
- Size:
- 1.71 KB
- Format:
- Item-specific license agreed upon to submission