charlatan makes fake data. It is inspired by, and borrows some code from, Python’s faker.
Why would you want to make fake data? Here are some possible use cases to give you a sense of what you can do with this package:
charlatan provides these interfaces:
- R6 objects that a user can initialize and then call methods on. These contain all the logic that the interfaces below use.
- ch_*() functions that wrap the low-level interfaces. They are meant to be easier to use and provide an easy way to make many instances of a thing.
- ch_generate() - generates a data.frame with fake data, choosing which columns to include from the data types provided in charlatan.
- fraudster() - a single interface to all fake data methods. It returns vectors/lists of data and wraps the ch_*() functions described above.
Stable version from CRAN
install.packages("charlatan")
Development version from GitHub
devtools::install_github("ropensci/charlatan")
library("charlatan")
… for all fake data operations
x <- fraudster()
x$job()
#> [1] "Astronomer"
x$name()
#> [1] "Veronica Spencer"
x$job()
#> [1] "Marine scientist"
x$color_name()
#> [1] "Orchid"
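Since fraudster() wraps the ch_*() functions, the two styles should be interchangeable for single values. The sketch below uses only calls that appear in this README and builds one fake record through each interface. The variable names are illustrative, and the values differ on every run because the data is random.

```r
library("charlatan")

# One record via the fraudster() client
x <- fraudster()
via_client <- c(name = x$name(), job = x$job(), color = x$color_name())

# The same kind of record via the ch_*() functions directly
via_functions <- c(name = ch_name(), job = ch_job(), color = ch_color_name())

# Both are named character vectors with the same fields
via_client
via_functions
```

The client is convenient when several data types are needed from one object, while the ch_*() functions are handy for one-off calls.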
More locales are being added over time. For example, locale support for job data:
ch_job(locale = "en_US", n = 3)
#> [1] "Social researcher" "Oncologist" "Sports therapist"
ch_job(locale = "fr_FR", n = 3)
#> [1] "Chargé des relations publiques"
#> [2] "Commissaire de police"
#> [3] "Technicien de l'intervention sociale et familiale"
ch_job(locale = "hr_HR", n = 3)
#> [1] "Prvostupnik fizioterapije" "Galanterist"
#> [3] "Ljekarnik"
ch_job(locale = "uk_UA", n = 3)
#> [1] "Музикознавець" "Євнух" "Драматург"
ch_job(locale = "zh_TW", n = 3)
#> [1] "農藝作物栽培工作者" "人力資源主管" "醫療人員"
For colors:
ch_color_name(locale = "en_US", n = 3)
#> [1] "GreenYellow" "MediumAquaMarine" "Linen"
ch_color_name(locale = "uk_UA", n = 3)
#> [1] "Темно-оливковий" "Шкіра буйвола" "Карміновий"
More coming soon …
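To compare locales side by side, the per-locale calls above can be collected in a loop. This is a minimal sketch that assumes the locale codes shown in the job examples are available in the installed version; the variable names are illustrative.

```r
library("charlatan")

# Locale codes taken from the job examples above
locales <- c("en_US", "fr_FR", "hr_HR", "uk_UA", "zh_TW")

# Draw three job titles per locale; the result is a named list
jobs_by_locale <- lapply(locales, function(loc) ch_job(locale = loc, n = 3))
names(jobs_by_locale) <- locales

# Inspect the structure: one character vector of length 3 per locale
str(jobs_by_locale)
```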
ch_generate()
#> # A tibble: 10 x 3
#>    name                   job                          phone_number
#>    <chr>                  <chr>                        <chr>
#>  1 Judy Sporer            Camera operator              1-367-536-0011
#>  2 Marquis Mohr           Fast food restaurant manager (542)769-1599x518
#>  3 Ludwig Zemlak          Operations geologist         922.974.3789x94351
#>  4 Jocelynn Conroy-Mante  Multimedia programmer        299.424.0069x6296
#>  5 Mr. Kale Hettinger DDS Research scientist (maths)   (540)764-1898x22536
#>  6 Georgeann Hermiston    Forensic psychologist        1-508-036-8142
#>  7 Kisha Auer             Housing manager/officer      09031038228
#>  8 Mannie Bauch           Hydrogeologist               1-520-018-4070
#>  9 Mr. Solomon Stokes     Seismic interpreter          (677)625-4925x24154
#> 10 Aaron Koss             Counsellor                   693.127.2970x147
ch_generate('job', 'phone_number', n = 30)
#> # A tibble: 30 x 2
#>    job                              phone_number
#>    <chr>                            <chr>
#>  1 Scientific laboratory technician (850)989-7631x97561
#>  2 Health and safety inspector      1-331-469-3494x07155
#>  3 Scientist, audiological          (949)866-2603x6495
#>  4 Engineer, site                   916-028-1084x07115
#>  5 Advertising account executive    (003)000-9606x58336
#>  6 Doctor, general practice         1-515-692-0101
#>  7 Special effects artist           302.750.6226
#>  8 Film/video editor                (815)318-0611x01524
#>  9 Personal assistant               1-239-383-1485
#> 10 Psychotherapist                  451.956.1823x6932
#> # ... with 20 more rows
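Because ch_generate() returns a data frame (a tibble in the output above), the result can be handled like any other table. As a rough sketch of a "fake dataset" workflow, the following writes a generated table to a temporary CSV file and reads it back with base R. The file path and variable names are illustrative.

```r
library("charlatan")

# Generate 5 rows with the two columns used in the example above
fake <- ch_generate("job", "phone_number", n = 5)

# Write to a temporary CSV file
path <- tempfile(fileext = ".csv")
write.csv(fake, path, row.names = FALSE)

# Read it back; colClasses keeps phone numbers as text, since some
# generated numbers consist only of digits and may start with 0
restored <- read.csv(path, stringsAsFactors = FALSE, colClasses = "character")

dim(restored)
names(restored)
```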
ch_name()
#> [1] "Mr. Kaiden Sawayn V"
ch_name(10)
#> [1] "Mrs. Zandra Hackett" "Elyssa Gulgowski"
#> [3] "Dr. Tyrone Ferry" "Bonnie Hagenes"
#> [5] "Mr. Romeo Zieme DVM" "Lavona Stroman"
#> [7] "Ms. Tayler Haag" "Gaige Bernier"
#> [9] "Hazen Vandervort" "Gerold Cummerata-Grimes"
ch_phone_number()
#> [1] "202.983.9204"
ch_phone_number(10)
#> [1] "1-206-722-5481x1441" "1-411-619-5450x3244" "853.148.3140"
#> [4] "780.223.8724x856" "319-563-9854x07865" "05226231709"
#> [7] "1-668-619-9351" "(528)949-0545x883" "814.730.1022x316"
#> [10] "107-705-6803x34432"
ch_job()
#> [1] "Environmental health practitioner"
ch_job(10)
#> [1] "Loss adjuster, chartered"
#> [2] "Television camera operator"
#> [3] "Conservation officer, historic buildings"
#> [4] "Hydrogeologist"
#> [5] "Production assistant, radio"
#> [6] "Site engineer"
#> [7] "Research scientist (maths)"
#> [8] "Barrister"
#> [9] "Pharmacologist"
#> [10] "Amenity horticulturist"
ch_credit_card_provider()
#> [1] "Discover"
ch_credit_card_provider(n = 4)
#> [1] "VISA 16 digit" "Discover" "Maestro" "JCB 16 digit"
ch_credit_card_number()
#> [1] "502041135452732"
ch_credit_card_number(n = 10)
#> [1] "3405222329210947" "3112783185875830541" "4856493326911841"
#> [4] "52194291540008498" "51436917795410462" "3055719091956532"
#> [7] "4248474110967" "3028600541934690" "3018386127265689"
#> [10] "639096780650568"
ch_credit_card_security_code()
#> [1] "781"
ch_credit_card_security_code(10)
#> [1] "880" "0838" "249" "492" "045" "943" "5684" "478" "385" "435"
Real data is messy, right? charlatan makes it easy to create messy data. This feature is still in the early stages, so it is not yet available for most data types and languages, but we’re working on it.
For example, create messy names:
ch_name(50, messy = TRUE)
#> [1] "Garnet Kub" "Larue Cormier"
#> [3] "Dr. Elden Monahan Jr." "Male Kreiger-Blanda"
#> [5] "Alden Daugherty I" "Claude Cartwright"
#> [7] "Jelani Ziemann-Huel" "Malaya Swaniawski"
#> [9] "Orrie Morar Jr." "Dr. Phoenix Haley"
#> [11] "Delwin Hoeger" "Lora Rowe"
#> [13] "Faye Stoltenberg-Rutherford" "Wellington Stehr-Hudson"
#> [15] "Ms. Mckenna Block" "Dawna Barton"
#> [17] "Dr. Barnard Beer V" "Naoma Beahan"
#> [19] "Harrell Fisher" "Killian Hyatt"
#> [21] "Arta Parisian" "Mr. Boyce Gottlieb PhD"
#> [23] "Weaver Gleason" "Chaim Larson"
#> [25] "Danniel Davis V" "Tylor Wiza"
#> [27] "Mrs. Kenzie Bergnaum md" "Dr. Madilyn VonRueden d.d.s."
#> [29] "Fairy Goldner" "Rolanda Fadel-Mohr"
#> [31] "Sherryl Kertzmann PhD" "Rylee Keeling-Kreiger"
#> [33] "Mr. Jaxson Crist IV" "Cris Wiegand"
#> [35] "Miss Willodean Stokes dvm" "Jamaal Wyman-Kris"
#> [37] "Lindsey Ebert-Dooley" "Dr. Santo Treutel"
#> [39] "Jonna Murphy" "Holly Erdman"
#> [41] "Mr. Theophile Flatley" "Logan Hartmann"
#> [43] "Gil Wiegand-O'Reilly" "Beau Anderson V"
#> [45] "Tobias King" "Louie Little"
#> [47] "Constance Cummings" "Kamila Wilderman"
#> [49] "Otis Anderson" "Pluma Runolfsdottir"
Right now only prefixes and suffixes for names in the en_US locale are supported. Notice the variation in prefixes and suffixes in the output above, for example the lowercase "md", "d.d.s." and "dvm".
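To see the effect of messy = TRUE, it can help to compare it with the default output. The sketch below draws one sample of each kind and flags names whose last word starts with a lowercase letter, such as the "md" and "dvm" suffixes in the output above. The helper ends_lowercase() is a rough heuristic written for this illustration and is not part of charlatan.

```r
library("charlatan")

clean <- ch_name(50)
messy <- ch_name(50, messy = TRUE)

# Heuristic (not part of charlatan): TRUE when the last word of a name
# starts with a lowercase letter, as in "md", "dvm" or "d.d.s."
ends_lowercase <- function(x) grepl(" [a-z][a-z.]*$", x)

# Count flagged names in each sample; the counts vary between runs
sum(ends_lowercase(clean))
sum(ends_lowercase(messy))

# Show the flagged messy names, if any
messy[ends_lowercase(messy)]
```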