Michael Anthony PRO
MikeDoes
AI & ML interests
Privacy, Large Language Model, Explainable
Recent Activity
reacted to their post about 16 hours ago
Things our clients and the open-source community actually said to us this year:
"Finally, someone built synthetic PII training data for German."
"Does it cover localised information? Not just the language, the actual format. That must have been a lot of work that we can save on our side."
"We operate in 12 EU countries. Your dataset is the only one that covers all of them, which has helped us a lot with compliance, especially because it's synthetic."
Every language has strong PII localization: names, addresses, IDs, phone numbers, and dates in the real format of that country.
23 languages. 29 regions. 3 scripts. 1,428,143 examples.
100% synthetic. Zero real personal data. Free on Hugging Face.