
# Sample

Sample lets you view and work with a subset of your dataset. Select the subset using any of the options below.

### Input/Output

| Inputs                                                                                                                                                                                                                                                          | Outputs                                                                                                                                    |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| <p><strong>Select Rules</strong> - the sampling rule to apply</p><p><strong>Value, N</strong> - the number N used in the rule descriptions</p><p><strong>Group By (optional)</strong> - column(s) to group the table by. The selected rule is applied to each group (for example, First N rows returns N rows for each group)</p> | Table containing the selected subset of the rows from your input dataset. |

### Options

| Option            | Description                                                                 |
| ----------------- | --------------------------------------------------------------------------- |
| First N rows      | Returns the first N rows of the data (row 1 through row N).                 |
| Last N rows       | Returns the last N rows of the data.                                        |
| Skip first N rows | Returns all rows in the data starting after row N.                          |
| 1 every N rows    | Returns the first row of every group of N rows.                             |
| First N% rows     | Returns the first N percent of the rows in the data.                        |
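
The rules in the table can be read as simple list slices. The following is a minimal Python sketch, not Cascade's implementation: `sample_rows` and the rule names passed to it are hypothetical, and rounding the percentage down to a whole number of rows is an assumption, since this page does not specify it.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def sample_rows(rows: List[Row], rule: str, n: int) -> List[Row]:
    """Apply one sampling rule to a list of rows (illustrative only)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if rule == "first_n":
        # Rows 1 through N.
        return rows[:n]
    if rule == "last_n":
        # Guard n == 0, because rows[-0:] would return every row.
        return rows[-n:] if n > 0 else []
    if rule == "skip_first_n":
        # Everything after row N.
        return rows[n:]
    if rule == "one_every_n":
        if n == 0:
            raise ValueError("n must be positive for one_every_n")
        # First row of each consecutive block of N rows.
        return rows[::n]
    if rule == "first_n_percent":
        # Assumption: the row count is rounded down.
        count = len(rows) * n // 100
        return rows[:count]
    raise ValueError(f"unknown rule: {rule}")


# Ten hypothetical rows with ids 1..10.
rows = [{"id": i} for i in range(1, 11)]
print([r["id"] for r in sample_rows(rows, "one_every_n", 3)])  # [1, 4, 7, 10]
```

With ten rows and `N = 3`, `1 every N rows` keeps the first row of the blocks (1-3), (4-6), (7-9), and (10), which is why the last partial block still contributes a row in this sketch.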

You can also specify a Group By, which applies the selected rule separately to each group of rows.
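
To illustrate how a rule combines with grouping, here is a small Python sketch of `First N rows` applied per group. It is only an approximation under stated assumptions: `first_n_per_group`, the `name` and `team` columns, and the ordering of the output (groups in order of first appearance) are all hypothetical and not taken from Cascade.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def first_n_per_group(rows: List[Row], group_by: List[str], n: int) -> List[Row]:
    """Return the first n rows of each group (illustrative only)."""
    groups = defaultdict(list)
    for row in rows:
        # A group is identified by the values of all group-by columns.
        key = tuple(row[col] for col in group_by)
        groups[key].append(row)

    result: List[Row] = []
    # Dicts keep insertion order, so groups come out in order of first appearance.
    for group_rows in groups.values():
        result.extend(group_rows[:n])
    return result


# Hypothetical data: five players on two teams.
players = [
    {"name": "A", "team": "X"},
    {"name": "B", "team": "Y"},
    {"name": "C", "team": "X"},
    {"name": "D", "team": "X"},
    {"name": "E", "team": "Y"},
]
sampled = first_n_per_group(players, ["team"], 2)
print([p["name"] for p in sampled])  # ['A', 'C', 'B', 'E']
```

Each team contributes at most two rows, so the result has N rows per group rather than N rows overall.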

### Example

Let's say we have a list of baseball players that we want to use a sample of:

{% embed url="<https://datawrapper.dwcdn.net/6N2lX/1/>" %}

We only want to see the first 5 players in the list, so we select `First N rows` and set `N = 5`. We get the following result:

{% embed url="<https://datawrapper.dwcdn.net/eO2Ph/1/>" %}

### Example 2

Now we want to use only the first 20% of the emails in the list, so we select `First N% rows` and set `N = 20`. We get the following result:

{% embed url="<https://datawrapper.dwcdn.net/bRuH5/1/>" %}

