File Reader Tools
The FileReader tool provides multi-format file reading with line numbers, pagination, and safety caps. It supports plain text files, Jupyter notebooks, PDFs, and images.
Class Overview
FileReader provides four reading methods for different file formats:
- read() - Text files with line numbers and pagination
- read_notebook() - Jupyter notebooks (.ipynb)
- read_pdf() - PDF files (requires an optional dependency)
- read_image() - Image files as multimodal content blocks (Pillow is optional and used only for downsampling)
Usage
Reading Text Files
from toolregistry_hub import FileReader
# Read a file with line numbers
content = FileReader.read("/path/to/file.py")
print(content)
# [/path/to/file.py] lines 1-50 of 200 (use offset=51 to read more)
# 1 | import os
# 2 | import sys
# 3 |
# 4 | def main():
# ...
# Read with pagination
content = FileReader.read("/path/to/file.py", offset=50, limit=25)
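To make the offset and limit semantics concrete, here is a minimal sketch of a stand-in function that mimics the header and line-number layout shown above. The name numbered_page and its internals are assumptions for illustration, not the library's implementation; only the 1-indexed offset, the 2000-line default, and the output layout are taken from this page.

```python
import os
import tempfile
from typing import Optional


def numbered_page(path: str, offset: int = 1, limit: Optional[int] = None) -> str:
    """Hypothetical stand-in for FileReader.read(); illustration only."""
    if offset < 1:
        raise ValueError("offset must be >= 1")
    max_lines = 2000 if limit is None else limit  # documented default cap
    with open(path, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
    total = len(lines)
    chunk = lines[offset - 1 : offset - 1 + max_lines]
    last = offset + len(chunk) - 1
    header = f"[{path}] lines {offset}-{last} of {total}"
    if last < total:
        # Point the caller at the first line that was not returned.
        header += f" (use offset={last + 1} to read more)"
    body = [f"{number} | {text}" for number, text in enumerate(chunk, start=offset)]
    return "\n".join([header, *body])


# Page through a five-line demo file two lines at a time.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.txt")
    with open(demo, "w", encoding="utf-8") as fh:
        fh.write("\n".join(f"line {i}" for i in range(1, 6)))
    print(numbered_page(demo, offset=1, limit=2))
    print(numbered_page(demo, offset=3, limit=2))
```

Passing the offset suggested in the header to the next call continues exactly where the previous page stopped.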
Reading Jupyter Notebooks
# Read notebook cells with type markers and outputs
content = FileReader.read_notebook("analysis.ipynb")
# [Notebook: analysis.ipynb]
#
# --- Cell 1 [markdown] ---
# # Data Analysis
#
# --- Cell 2 [code] ---
# ```python
# import pandas as pd
# df = pd.read_csv("data.csv")
# ```
# Output:
# ...
No external dependencies needed -- uses stdlib json.
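Because a notebook is plain JSON, the cell listing above can be approximated with the standard library alone. The sketch below is an illustration under assumptions: format_notebook is a hypothetical helper, it handles only stream outputs, and it is not the library's code.

```python
import json


def format_notebook(raw_json: str, name: str) -> str:
    """Hypothetical formatter that mimics the layout above; illustration only."""
    notebook = json.loads(raw_json)  # invalid JSON raises a ValueError subclass
    if "cells" not in notebook:
        raise ValueError(f"{name} is not a valid notebook")
    fence = "`" * 3  # built at runtime so this example stays fence-safe
    parts = [f"[Notebook: {name}]"]
    for index, cell in enumerate(notebook["cells"], start=1):
        cell_type = cell.get("cell_type", "unknown")
        # nbformat stores source as one string or a list of lines;
        # "".join() yields the full text in both cases.
        text = "".join(cell.get("source", ""))
        parts += ["", f"--- Cell {index} [{cell_type}] ---"]
        if cell_type == "code":
            parts.append(f"{fence}python\n{text}\n{fence}")
        else:
            parts.append(text)
        for output in cell.get("outputs", []):
            if output.get("output_type") == "stream":
                parts += ["Output:", "".join(output.get("text", ""))]
    return "\n".join(parts)


# A tiny in-memory notebook, serialized the way an .ipynb file would be.
demo = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Data Analysis"]},
        {
            "cell_type": "code",
            "source": ["import math\n", "print(math.pi)"],
            "outputs": [{"output_type": "stream", "name": "stdout",
                         "text": ["3.141592653589793\n"]}],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})
print(format_notebook(demo, "analysis.ipynb"))
```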
Reading PDFs
# Read all pages (up to the 20-page cap)
content = FileReader.read_pdf("document.pdf")
# Read specific page range
content = FileReader.read_pdf("document.pdf", pages="5-10")
# Read a single page
content = FileReader.read_pdf("document.pdf", pages="3")
Requires pypdf or pdfplumber. If both are installed, pdfplumber is preferred for better text quality.
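The preference order can be sketched with importlib.util.find_spec, which checks whether a package is importable without importing it. The helper pick_pdf_backend and its available parameter are hypothetical names introduced for this example; how the library actually detects its backends is not shown on this page.

```python
import importlib.util
from typing import Iterable, Optional


def pick_pdf_backend(available: Optional[Iterable[str]] = None) -> str:
    """Return "pdfplumber" if present, else "pypdf"; raise ImportError if neither.

    `available` lets callers (and tests) override detection with an explicit
    collection of package names. This is an illustration, not the library's code.
    """
    known = None if available is None else set(available)

    def installed(name: str) -> bool:
        if known is not None:
            return name in known
        return importlib.util.find_spec(name) is not None

    # pdfplumber comes first because it is preferred when both are installed.
    for name in ("pdfplumber", "pypdf"):
        if installed(name):
            return name
    raise ImportError("Reading PDFs requires pypdf or pdfplumber")


print(pick_pdf_backend({"pypdf", "pdfplumber"}))
print(pick_pdf_backend({"pypdf"}))
```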
Reading Images
# Read an image — returns multimodal content blocks
blocks = FileReader.read_image("screenshot.png")
# [
# {"type": "text", "text": "[Image: screenshot.png (image/png, 45321 bytes)]"},
# {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "iVBOR..."}}
# ]
# With custom max size (default 5 MB base64)
blocks = FileReader.read_image("large_photo.jpg", max_size=1_000_000)
Supported formats: .png, .jpg, .jpeg, .gif, .webp.
If the base64-encoded image exceeds max_size, Pillow is used for adaptive quality downsampling, so downsampling requires Pillow. If Pillow is not installed, the original image is returned and a warning is logged.
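The two-block structure shown above can be reproduced with the standard library. This is a minimal sketch under assumptions: image_blocks is a hypothetical helper, the byte count in the text block is taken to be the raw file size, and no downsampling is attempted.

```python
import base64
import os

# Media types for the extensions listed as supported on this page.
MEDIA_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_blocks(name: str, raw: bytes) -> list:
    """Build a text block plus a base64 image block (illustration only)."""
    extension = os.path.splitext(name)[1].lower()
    if extension not in MEDIA_TYPES:
        raise ValueError(f"unsupported image extension: {extension!r}")
    media_type = MEDIA_TYPES[extension]
    data = base64.b64encode(raw).decode("ascii")
    return [
        # Assumption: the size reported here is the raw byte count.
        {"type": "text", "text": f"[Image: {name} ({media_type}, {len(raw)} bytes)]"},
        {"type": "image",
         "source": {"type": "base64", "media_type": media_type, "data": data}},
    ]


# The 8-byte PNG signature stands in for real image data.
for block in image_blocks("screenshot.png", b"\x89PNG\r\n\x1a\n"):
    print(block)
```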
Parameters
read()
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | str | required | Path to text file |
| offset | int | 1 | Starting line number (1-indexed) |
| limit | int \| None | None | Max lines to read (None uses the 2000-line default) |
read_notebook()
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | str | required | Path to .ipynb file |
read_pdf()
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | str | required | Path to PDF file |
| pages | str \| None | None | Page range (e.g. "1-5", "3") |
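A page specification such as "1-5" or "3" can be turned into page indices as sketched below. The helper parse_pages is hypothetical, and its choices (clamping the end of a range to the document length, applying the 20-page cap to explicit ranges as well) are assumptions rather than documented behavior.

```python
from typing import List, Optional


def parse_pages(pages: Optional[str], total_pages: int, cap: int = 20) -> List[int]:
    """Return 0-based page indices for a spec such as "5-10" or "3".

    Illustration only: None selects from page 1, and at most `cap` pages
    are returned in every case.
    """
    if pages is None:
        first, last = 1, total_pages
    else:
        head, dash, tail = pages.strip().partition("-")
        try:
            first = int(head)
            last = int(tail) if dash else first
        except ValueError:
            raise ValueError(f"invalid page range: {pages!r}") from None
        if first < 1 or last < first or first > total_pages:
            raise ValueError(f"invalid page range: {pages!r}")
        last = min(last, total_pages)  # assumption: clamp instead of failing
    last = min(last, first + cap - 1)  # enforce the per-call page cap
    return list(range(first - 1, last))


print(parse_pages("5-10", total_pages=30))
print(parse_pages("3", total_pages=30))
print(len(parse_pages(None, total_pages=50)))
```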
read_image()
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | str | required | Path to image file |
| max_size | int | 5242880 | Max base64-encoded size in bytes (5 MB) |
Safety Caps
- Text files: 10 MB max file size
- Text lines: 2000 lines default per read
- PDF pages: 20 pages max per call
- Notebook outputs: 10 KB per cell output
- Images: 5 MB max base64-encoded size (auto-downsampled if exceeded)
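Two of these caps lend themselves to a quick pre-check: the base64 size of an image can be computed from its raw size without encoding it, and an oversized cell output can be cut on a byte boundary. The sketch below is illustrative only. The constant and function names are hypothetical, and reading KB as 1024 bytes is an assumption extrapolated from the documented 5242880-byte image limit.

```python
# Hypothetical constants mirroring two caps from the list above.
MAX_CELL_OUTPUT_BYTES = 10 * 1024        # 10 KB per notebook cell output
MAX_IMAGE_B64_BYTES = 5 * 1024 * 1024    # 5242880, the documented image limit


def base64_size(raw_size: int) -> int:
    """Length of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def needs_downsampling(raw_size: int, max_size: int = MAX_IMAGE_B64_BYTES) -> bool:
    """True if the encoded image would exceed max_size."""
    return base64_size(raw_size) > max_size


def truncate_output(text: str, cap: int = MAX_CELL_OUTPUT_BYTES) -> str:
    """Cut text to at most cap UTF-8 bytes and append a marker if it was cut."""
    data = text.encode("utf-8")
    if len(data) <= cap:
        return text
    # errors="ignore" drops a multi-byte character split by the cut.
    return data[:cap].decode("utf-8", errors="ignore") + "\n... [truncated]"


print(base64_size(45321))
print(needs_downsampling(4_000_000))
print(len(truncate_output("x" * 20_000)))
```

Base64 expands data by a factor of 4/3, so a raw file of roughly 3.75 MB already reaches the 5 MB encoded limit.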
MCP Server Endpoints
POST /tools/reader/read
POST /tools/reader/read_pdf
POST /tools/reader/read_notebook
POST /tools/reader/read_image
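This page lists only the endpoint paths. As a rough sketch, assuming the server is reachable at a base URL such as http://localhost:8000 and accepts a JSON body whose keys mirror the method parameters (both are assumptions, not confirmed here), a request for the read endpoint could be built with the standard library. The request is constructed but not sent.

```python
import json
import urllib.request
from typing import Optional


def build_read_request(base_url: str, path: str, offset: int = 1,
                       limit: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for /tools/reader/read.

    Hypothetical helper: the JSON body shape is an assumption.
    """
    payload = {"path": path, "offset": offset, "limit": limit}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/tools/reader/read",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The base URL below is a placeholder for wherever the server is running.
request = build_read_request("http://localhost:8000", "/path/to/file.py", limit=25)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to urllib.request.urlopen once the server address and body format are confirmed.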
API Reference
toolregistry_hub.file_reader.FileReader
Multi-format file reader with line numbers and pagination.
read (staticmethod)
Read a text file with line numbers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to file. | required |
| offset | int | Starting line number (1-indexed). Defaults to 1. | 1 |
| limit | int \| None | Maximum number of lines to read. Defaults to 2000. | None |
Returns:
| Type | Description |
|---|---|
| str | File content with line numbers. Includes a metadata header with file path, total lines, and the range actually read. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
| IsADirectoryError | If the path is a directory. |
| ValueError | If offset is less than 1. |
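Callers that prefer a message over a traceback can wrap a reader in a small handler for the three documented exceptions. The sketch below takes the reader as a callable so that it runs without the library installed; read_or_explain and the stand-in fake_read are hypothetical names, and in practice FileReader.read would be passed instead.

```python
def read_or_explain(reader, path: str, **kwargs) -> str:
    """Call reader(path, **kwargs) and turn documented errors into messages."""
    try:
        return reader(path, **kwargs)
    except FileNotFoundError:
        return f"error: {path} does not exist"
    except IsADirectoryError:
        return f"error: {path} is a directory"
    except ValueError as exc:
        return f"error: {exc}"


def fake_read(path: str, offset: int = 1) -> str:
    """Stand-in reader that only reproduces the documented offset check."""
    if offset < 1:
        raise ValueError("offset must be >= 1")
    return f"[{path}] read from line {offset}"


print(read_or_explain(fake_read, "notes.txt", offset=3))
print(read_or_explain(fake_read, "notes.txt", offset=0))
```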
Source code in toolregistry_hub/file_reader.py
read_image (staticmethod)
Read an image file and return as multimodal content blocks.
Returns a list of content blocks (TextBlock + ImageBlock) that the
toolregistry pipeline can expand into format-specific multimodal
messages via expand_content_blocks().
If the base64-encoded image exceeds max_size, Pillow is used to
downsample it. If Pillow is not installed, the original image is
returned with a warning.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to image file (.png, .jpg, .jpeg, .gif, .webp). | required |
| max_size | int | Maximum base64-encoded size in bytes. Defaults to 5 MB. | _MAX_IMAGE_SIZE_BYTES |
Returns:
| Type | Description |
|---|---|
| list | A list of two content blocks: {"type": "text", "text": "[Image: name (mime, size)]"} followed by {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "iVBOR..."}} |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
| ValueError | If the file extension is not supported. |
read_notebook (staticmethod)
Read a Jupyter notebook and return formatted cell contents.
Uses stdlib json only — no external dependencies.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to .ipynb file. | required |
Returns:
| Type | Description |
|---|---|
| str | All cells with type markers (code/markdown) and outputs. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
| ValueError | If the file is not a valid notebook. |
read_pdf (staticmethod)
Read a PDF file and extract text.
Uses pypdf (zero-dependency, BSD) by default. If pdfplumber
is installed, uses it for better text quality.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to PDF file. | required |
| pages | str \| None | Page range string (e.g. "1-5", "3"). | None |
Returns:
| Type | Description |
|---|---|
| str | Extracted text content with page markers. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
| ImportError | If neither pypdf nor pdfplumber is installed. |
| ValueError | If page range is invalid. |