---

title: "AntClassify: An R Package for Standardized Classification of Ant Communities, Functional Guilds, Endemism, and Rarity"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{AntClassify: Classification of Ant Communities}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}

---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(AntClassify)
```

## Introduction

The AntClassify package provides an integrated ecological pipeline to classify ant communities into functional guilds, identify exotic species, detect endemic and rare species of the Atlantic Forest, and quantify key ecological patterns.

This tool was developed to facilitate ecological analyses, standardize functional classification, and improve reproducibility in ant community studies. By integrating multiple ecological databases into a single workflow, AntClassify allows researchers to efficiently assess community structure, biological invasions, endemism, and rarity patterns.

The package is particularly useful for biodiversity monitoring, conservation planning, and macroecological research involving ant assemblages.

AntClassify aims to provide a standardized and reproducible framework for advancing ecological research on ant communities.

## Example dataset

```{r}
dados <- data.frame(
  Atta_sexdens = 50,
  Camponotus_atriceps = 40,
  Crematogaster_sp = 35,
  Cyphomyrmex_minutus = 30,
  Cyphomyrmex_rimosus = 28,
  Ectatomma_edentatum = 25,
  Heteroponera_mayri = 22,
  Holcoponera_striatula = 20,
  Monomorium_floricola = 18,
  Monomorium_pharaonis = 17,
  Pheidole_megacephala = 16,
  Strumigenys_emmae = 15,
  Strumigenys_rogeri = 14,
  Nylanderia_fulva = 13,
  Odontomachus_chelifer = 12,
  Oxyepoecus_reticulatus = 11,
  Pachycondyla_striata = 10,
  Apterostigma_serratum = 9,
  Brachymyrmex_delabiei = 8,
  Brachymyrmex_feitosai = 7,
  Camponotus_fallatus = 6,
  Camponotus_hermanni = 5,
  Camponotus_xanthogaster = 4,
  Pheidole_aberrans = 3,
  Pheidole_fimbriata = 3,
  Pheidole_obscurithorax = 2,
  Pheidole_subarmata = 2,
  Strumigenys_fridericimuelleri = 2,
  Heteroponera_inermis = 2,
  Oxyepoecus_browni = 2,
  Sphinctomyrmex_stali = 1,
  Strumigenys_sanctipauli = 1,
  Brachymyrmex_micromegas = 1,
  Camponotus_tripartitus = 1,
  Diaphoromyrma_sofiae = 1
)

colnames(dados) <- gsub("_", " ", colnames(dados))

dados
```

## Running the pipeline

```{r}
resultado <- antclassify(dados, validate = FALSE, plot = FALSE)
```

## Accessing results

```{r}
names(resultado)

head(resultado$guilds$table)
resultado$exotics
resultado$endemics
resultado$rarity
```

## Using individual functions

Although `antclassify()` runs the full pipeline, users can also apply each function separately depending on their research goals.

### Functional guild classification

```{r}
guilds <- assign_guild_ants(dados, validate = FALSE, plot = FALSE)

head(guilds$table)
```

### Exotic species detection

```{r}
exotics <- check_exotic_ants(dados, validate = FALSE, plot = FALSE)
exotics
```

### Endemic species (Atlantic Forest)

```{r}
endemics <- check_endemic_atlantic_ants(dados, validate = FALSE, plot = FALSE)
endemics
```

### Rarity classification

```{r}
rarity <- check_rarity_atlantic_ants(dados, validate = FALSE, plot = FALSE)
rarity
```

## Multi-site analysis with `antclassify_community()`

When a study involves several sampling units, `antclassify_community()` applies the full classification pipeline to each row of a community matrix and returns aggregated site-by-guild information. This avoids manual loops and ensures consistent classification across all sites.

The package includes a small built-in dataset, `ant_community`, that can be used to test this function.

```{r}
data(ant_community)

# Run the pipeline on the built-in dataset
res_com <- antclassify_community(ant_community, guild_col = "antclassify_guild",
                                 validate = FALSE)

# Abundance matrix (sites × guilds)
res_com$guild_abundance

# Guild richness per site
res_com$guild_richness
```
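
To make the idea of site-by-guild aggregation concrete, the following base R sketch sums a toy site-by-species matrix into guild totals. It is only an illustration under stated assumptions: the species names, the guild labels (`guild_x`, `guild_y`), and the lookup vector `toy_guild` are placeholders invented here, and the code does not reproduce the internals of `antclassify_community()`. The richness calculation shows one plausible reading of guild richness (species present per guild at each site).

```{r}
# Toy site-by-species abundance matrix (placeholder names, not real data)
toy_comm <- matrix(
  c(5, 0, 2, 1,
    0, 3, 0, 4),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("site_1", "site_2"),
                  c("sp_a", "sp_b", "sp_c", "sp_d"))
)

# Hypothetical species-to-guild lookup (labels invented for illustration)
toy_guild <- c(sp_a = "guild_x", sp_b = "guild_x",
               sp_c = "guild_y", sp_d = "guild_y")
guild_of_column <- toy_guild[colnames(toy_comm)]

# rowsum() sums rows by group, so transpose to put species in rows,
# aggregate by guild, then transpose back to sites x guilds
toy_abund <- t(rowsum(t(toy_comm), group = guild_of_column))

# Same aggregation on a presence/absence version of the matrix
toy_rich <- t(rowsum(t((toy_comm > 0) * 1), group = guild_of_column))

toy_abund
toy_rich
```

Keeping the lookup as a named vector indexed by column name means the result does not depend on the column order of the community matrix.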

## Input data format

The package expects a community matrix where:

- Rows represent sampling units (or a single community)
- Columns represent species
- Values represent abundance (or presence/absence)

Species names must be provided as column names.
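
Field records are often stored in long format, with one row per site and species. The sketch below shows one way to reshape such records into the wide layout described above using base R. The data frame `toy_records` and its column names (`site`, `species`, `abundance`) are assumptions made for this example, not a format required by the package.

```{r}
# Hypothetical long-format records: one row per site x species observation
toy_records <- data.frame(
  site      = c("site_1", "site_1", "site_2", "site_2", "site_2"),
  species   = c("Atta sexdens", "Pheidole megacephala",
                "Atta sexdens", "Atta sexdens", "Strumigenys emmae"),
  abundance = c(10, 3, 4, 2, 1)
)

# xtabs() sums abundance for each site x species combination and
# fills combinations that were never recorded with 0
toy_table <- xtabs(abundance ~ site + species, data = toy_records)

# Convert the contingency table to a data frame with sites as rows
# and species as columns; names containing spaces are kept as-is
toy_wide <- as.data.frame.matrix(toy_table)

toy_wide
```

Repeated records for the same site and species (here, two `Atta sexdens` rows for `site_2`) are summed rather than overwritten.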

### Example structure

The built-in dataset `ant_community` demonstrates the expected format:

```{r}
data(ant_community)
ant_community
```

## Importing data from external files

### CSV files

```{r eval=FALSE}
dados <- read.csv("data.csv", check.names = FALSE)
```
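
The `check.names = FALSE` argument matters because, by default, `read.csv()` and `read.table()` pass column names through `make.names()`, which turns spaces into dots. The short sketch below illustrates the difference using an inline CSV string (`csv_text`, made up for this example) instead of a file on disk.

```{r}
# Inline CSV with one underscore-separated and one space-separated name
csv_text <- "Atta_sexdens,Camponotus atriceps\n50,40\n"

# Default behaviour: the space is replaced by a dot
default_names <- colnames(read.csv(text = csv_text))

# With check.names = FALSE the header is kept exactly as written
kept_names <- colnames(read.csv(text = csv_text, check.names = FALSE))

default_names
kept_names
```

Names altered by the default behaviour contain dots, so replacing underscores afterwards would not restore the original spacing.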

### TXT files

```{r eval=FALSE}
dados <- read.table("data.txt", header = TRUE, sep = "\t", check.names = FALSE)
```

### Excel files

```{r eval=FALSE}
# install.packages("readxl")
library(readxl)

dados <- read_excel("data.xlsx")
dados <- as.data.frame(dados)
```

### Important note

```{r eval=FALSE}
colnames(dados) <- gsub("_", " ", colnames(dados))
```

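If the imported column names use underscores in place of spaces (for example `Atta_sexdens`), replace them with spaces before calling any AntClassify function, as in the code above. This keeps the column names compatible with the internal species name standardization used in **AntClassify**.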
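
After the replacement, a quick check can flag column names that still do not look like a "Genus species" pair. The following is a minimal sketch: the regular expression is an assumption chosen for this example and is not the validation rule used inside the package, and `raw_names` is a made-up vector.

```{r}
# Made-up column names: underscore, space, and dot separators
raw_names <- c("Atta_sexdens", "Camponotus atriceps", "Pheidole.megacephala")

# Replace underscores with spaces, as in the step above
clean_names <- gsub("_", " ", raw_names)

# Assumed pattern: capitalised genus, one space, lowercase epithet
looks_binomial <- grepl("^[A-Z][a-z]+ [a-z]+$", clean_names)

# Names that still need attention (here, the dot-separated one)
clean_names[!looks_binomial]
```

Morphospecies labels that include digits or punctuation (such as `sp.1`) would also be flagged by this pattern, so the result is best treated as a list to inspect by hand.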

## Final considerations

The AntClassify package provides a flexible workflow that can be used either as a fully automated pipeline or through modular functions, allowing users to adapt analyses to different ecological questions.

By integrating functional classification, invasion biology, endemism, and rarity into a single framework, the package enhances reproducibility and facilitates ecological interpretation of ant communities.
