# XML Feeds

* (Most common) Large XML feed for your overall, chronologically-ordered job feed
* Different XML feeds per source JobFront tracks for you
* Smaller XML feeds for different categories (job function, location, etc.)

**How it works**

* If you are ingesting a separate XML feed for each source JobFront tracks for you (typical for job boards), you can search for and retrieve sources in the JobFront data playground app once you have access. The XML feed link for each source is shown directly in the interface, and you can copy it into your own XML ingestion program.
* If you are ingesting a single large XML feed from many sources in chronological order (for large job boards or for sales intelligence tools), we will usually give you an API token that you can use to generate a signed S3 bucket URL. You can then use that signed URL to download the XML feeds that we provide from S3.
  * You will be able to retrieve the full active jobs index, along with delta files (removed and added jobs in the past day)
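
As a rough illustration of how the full active jobs index and the daily delta files might be combined, the sketch below keeps a local index keyed by job `<id>`. It assumes that the added and removed delta files use the same `<jobs>`/`<job>` layout as the full feed, which this page does not confirm. The helper names `job_ids_and_titles` and `apply_delta` are hypothetical.

```python
import xml.etree.ElementTree as ET


def job_ids_and_titles(xml_text):
    """Return {job id: title} for every <job> directly under the root."""
    root = ET.fromstring(xml_text)
    return {job.findtext("id"): job.findtext("title") for job in root.findall("job")}


def apply_delta(index, added_xml, removed_xml):
    """Return a new index with removed jobs dropped and added jobs merged in.

    Assumption: both delta files share the <jobs>/<job> layout of the full feed.
    """
    removed = job_ids_and_titles(removed_xml)
    updated = {jid: title for jid, title in index.items() if jid not in removed}
    updated.update(job_ids_and_titles(added_xml))
    return updated
```

Removals are applied before additions here, so a job that appears in both delta files on the same day would remain in the index. Whether that matches the feed's semantics is also an assumption.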

## **Job Posts - XML Format**

```
<?xml version="1.0" encoding="UTF-8"?>
<jobs>
  <job>
    <id>Ja3f5322637f741200oeldu9c2ad2fbfe</id>
    <title>Sr. Director/Executive Director, Analytical Development</title>
    <description>Lead analytical development for late-stage cardiorenal small molecule programs</description>
    <post> FULL JOB POST HTML </post>
    <post_language>en</post_language>
    <url_job>https://job-boards.greenhouse.io/bridgebio/jobs/XXX</url_job>
    <work_model>on_site</work_model>
    <commitment>full_time</commitment>
    <level>expert</level>
    <benefits><benefit>401k participation</benefit></benefits>
    <requirements><requirement>Requires PhD</requirement></requirements>
    <responsibilities><responsibility>Work in the lab to develop various programs</responsibility></responsibilities>
    <salary>
      <min>60000</min>
      <max>80000</max>
      <currency>USD</currency>
      <period>year</period>
    </salary>
    <created>2025-04-03T12:37:45</created>
    <created_at>1743683865</created_at>
    <locations><location><text></text><city></city><state></state><country></country><zip></zip><latitude></latitude><longitude></longitude><cbsas><cbsa><title></title><code></code></cbsa></cbsas></location></locations>
    <tags><tag>Scientist</tag></tags>
    <source>
      <id>S1b6ad6b3d4ec4fa3ac7b5b98fa5c1486</id>
      <name>BridgeBio Pharma</name>
      <description>BridgeBio Pharma develops transformative medicines for genetic diseases and cancers with genetic drivers.</description>
      <domain>bridgebio.com</domain>
      <url_logo>https://jobfront-public.s3.us-west-2.amazonaws.com/sources/logos/S1b6ad6b3d4ec4fa3ac7b5b98fa5c1486.jpg</url_logo>
    </source>
  </job>
</jobs>
```

### Root Elements

* \<?xml version="1.0" encoding="UTF-8"?>: XML declaration
* \<jobs>: Root element containing all job entries

### Job Entry

Each job is enclosed in a \<job> tag and contains the following fields:

#### Basic Job Information

* \<id> Unique identifier for the job
* \<title> Title of the job position
* \<commitment> Type of job commitment
  * Options: full\_time, part\_time, internship, contract, temporary, volunteer, other
* \<description> Brief 1-line description of the job
* \<post> Detailed job posting in HTML format
* \<post\_language> Most likely 2-character language code
  * Note: The most common value is lowercase "en" (English), but we occasionally see jobs in dozens of other languages, indicated by their 2-character codes
  * Note: Some job posts contain several languages; in those cases we provide the single predominant language
* \<level> Level of the job position (e.g., entry\_level, senior)
  * Options: internship, entry\_level, junior, mid\_level, senior, expert (blank "" if not detected)
* \<benefits> List of freetext benefits offered with the job
  * List of \<benefit> tags with freetext benefits described in each
* \<requirements> List of requirements for the job
  * List of \<requirement> tags with freetext requirements described in each
* \<responsibilities> Responsibilities associated with the job
  * List of \<responsibility> tags with freetext responsibilities described in each

#### URLs

* \<url\_job> Origin URL for the job post

#### Dates

* \<created> Date the job was first seen (usually when we first scraped the job, although when we can determine that the job was posted earlier we use that more precise historical timestamp)
  * Format: YYYY-MM-DDThh:mm:ss, in UTC
    * ```
      <created>2025-04-03T12:37:45</created>
      ```
* \<created\_at> Timestamp when the job was created
  * Format: Unix timestamp
    * ```
      <created_at>1743683865</created_at>
      ```
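
A minimal sketch of reading both date fields in Python, assuming `<created>` carries no timezone suffix and should be interpreted as UTC as described above. The function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_created(text):
    """Parse a <created> value such as 2025-04-03T12:37:45 as a UTC datetime."""
    return datetime.fromisoformat(text).replace(tzinfo=timezone.utc)


def parse_created_at(text):
    """Parse a <created_at> Unix timestamp (in seconds) as a UTC datetime."""
    return datetime.fromtimestamp(int(text), tz=timezone.utc)


# Both fields in the example job above describe the same instant.
assert parse_created("2025-04-03T12:37:45") == parse_created_at("1743683865")
```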

#### Salary Information

\<salary> Salary information enclosing the following subtags:

* \<min> Minimum salary
* \<max> Maximum salary
* \<currency> 3-character currency code
  * ```
    <salary><currency> - ['USD','AED','CZK','BTC', ...]
    ```
* \<period> Pay period (default is "year")
  * ```
    <salary><period> - ['year','month','week','biweek','day','hour','minute']
    ```
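
The salary subtags could be read into a small record along these lines. This is only a sketch: the `Salary` class and `parse_salary` function are hypothetical, and parsing the amounts as floats is an assumption, since this page does not state whether decimal values occur. A missing or empty `<period>` falls back to "year", matching the default noted above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class Salary:
    minimum: Optional[float]
    maximum: Optional[float]
    currency: Optional[str]
    period: str


def _number(text):
    """Convert element text to float; empty or missing values become None."""
    text = (text or "").strip()
    return float(text) if text else None


def parse_salary(job):
    """Build a Salary from a <job> element, or None if it has no <salary>."""
    node = job.find("salary")
    if node is None:
        return None
    return Salary(
        minimum=_number(node.findtext("min")),
        maximum=_number(node.findtext("max")),
        currency=(node.findtext("currency") or "").strip() or None,
        period=(node.findtext("period") or "").strip() or "year",
    )
```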

#### Locations

\<locations> List of structured location information

* List of \<location> tags, each containing the following subtags:
  * \<text> City, State, Country of the job location (includes all available entries)
  * \<city> City of the job location
  * \<state> State or region of the job location
  * \<country> Country of the job location
  * \<zip> Zip code of the job location
  * \<latitude> Latitude of the job location
  * \<longitude> Longitude of the job location
  * \<cbsas> List of \<cbsa> tags, each containing a CBSA \<code> and \<title>
    * ```
      <locations><location><cbsas><cbsa><code> - CBSA code
      <locations><location><cbsas><cbsa><title> - CBSA title
      ```

#### Source Information

\<source> Source information enclosing the following subtags:

* \<id> Unique identifier for the source
* \<name> Name of the job source, organization, or company
* \<description> 1-line description of the job source
* \<domain> Naked domain of the source, for example “google.com”
* \<url\_logo> Image/logo URL hosted by JobFront

### Formatting Notes

1. All text content is escaped using XML entity encoding to prevent XML parsing errors.
2. Numeric fields (like salary) are not enclosed in quotes.
3. Empty fields are represented by self-closing tags or empty string values.
4. Lists (like locations, tags, industries) are represented by multiple child elements within a parent element.
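
Because aggregate feeds can be very large, a streaming parser is usually a safer choice than loading the whole document at once. The sketch below uses `xml.etree.ElementTree.iterparse` from the Python standard library. The `iter_jobs` name and the selection of fields are illustrative only, and the sample document is made up for the example. Entity-encoded text such as `&amp;` is decoded by the parser automatically.

```python
import io
import xml.etree.ElementTree as ET


def iter_jobs(source):
    """Yield one dict per <job> element.

    `source` is a filename or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "job":
            continue
        yield {
            "id": elem.findtext("id"),
            "title": elem.findtext("title"),
            "url_job": elem.findtext("url_job"),
            "tags": [t.text for t in elem.findall("tags/tag")],
            "source_name": elem.findtext("source/name"),
        }
        elem.clear()  # release the job's children to limit memory use


# Made-up sample document, small enough to read at a glance.
sample = io.BytesIO(
    b"<jobs><job><id>J1</id><title>R&amp;D Scientist</title>"
    b"<tags><tag>Scientist</tag></tags>"
    b"<source><name>Example Co</name></source></job></jobs>"
)
for job in iter_jobs(sample):
    print(job["id"], job["title"], job["tags"])
```

Fields that are absent from a job come back as `None`, and fields that are present but empty come back as an empty string, so downstream code should handle both.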

***

## **Intent Signals - XML Format**

```
<?xml version="1.0" encoding="UTF-8"?>
<signals>
  <signal>
    <id>P000ca7e5-63b0-4e6e-8577-5180163f069e</id>
    <category>finance</category>
    <type>cash-flow-constraints</type>
    <recency>active</recency>
    <severity>medium</severity>
    <match>medium</match>
    <classification>inferred</classification>
    <text>The company appears to be facing challenges with financial liquidity and cash flow management.</text>
    <evidences></evidences>
    <jobs>
      <job><id>J9f918b793cce4e079dd06e39400ad203</id><title>Service Technician/ Diesel Mechanic</title><url>https://recruiting.ultipro.com/wdl1000/JobBoard/05fea415-1d4a-a06d-1bca-d0abf475250e/OpportunityDetail?opportunityId=7df7ed7f-fc58-4673-8235-e1c853184d6d</url></job>
      <job><id>J84e747df6bba4573a223df4a5d869409</id><title>Accounting Clerk</title><url>https://recruiting.ultipro.com/wdl1000/JobBoard/05fea415-1d4a-a06d-1bca-d0abf475250e/OpportunityDetail?opportunityId=53b33c5e-105f-4ce6-846a-1d32a8be1d19</url></job>
      <job><id>Jcc9d6dc40ab24e0f9b17c28776992bc6</id><title>Staff Accountant</title><url>https://recruiting.ultipro.com/wdl1000/JobBoard/05fea415-1d4a-a06d-1bca-d0abf475250e/OpportunityDetail?opportunityId=8ee5e341-b207-420e-aff3-b6e4323d8c70</url></job>
    </jobs>
    <created>2025-04-12T22:41:14</created>
    <created_at>1744512074</created_at>
    <start_at>1738790452</start_at>
    <recent_at>1740766553</recent_at>
    <source>
      <id>S9c4a6ed8d2a34fc2832d2eb42b27b2f7</id>
      <name>Allstate Peterbilt Group</name>
      <description>A dealership group offering sales, parts, and service for Peterbilt trucks and other commercial vehicles across multiple locations.</description>
      <domain>allstatepeterbilt.com</domain>
      <url_logo>https://jobfront-public.s3.us-west-2.amazonaws.com/sources/logos/S9c4a6ed8d2a34fc2832d2eb42b27b2f7.jpg</url_logo>
    </source>
  </signal>
</signals>
```

### Root Elements

* \<?xml version="1.0" encoding="UTF-8"?>: XML declaration
* \<signals>: Root element containing all signal entries

### Signal Entry

Each signal is enclosed in a \<signal> tag and contains the following fields:

#### Basic Signal Information

* \<id> Unique identifier for the signal (from problem\_id)
* \<category> Category of the signal (from problem\_category)
* \<type> Type of the signal (from problem\_type)
* \<recency> Recency status of the signal (from problem\_recency)
* \<severity> Severity level of the signal (from problem\_severity)
* \<match> Match information for the signal (from problem\_match)
* \<classification> Classification of the signal (from problem\_classification)
* \<text> Descriptive text of the signal (from problem\_text)

#### Evidence Information

Enclosed in \<evidences> tags:

* Multiple \<evidence> tags: Each containing evidence supporting the signal

#### Associated Jobs

Enclosed in \<jobs> tags:

* Multiple \<job> tags, each containing:
  * \<id>: Unique identifier for the job (from job\_id)
  * \<title>: Title of the job (from job\_title)
  * \<url>: URL for the job posting (from url\_job)

#### Dates

* \<created>: Formatted timestamp when the signal was created (YYYY-MM-DDThh:mm:ss)
* \<created\_at>: Unix timestamp of when the signal was created
* \<start\_at>: Unix timestamp of when the signal started (from problem\_start\_at)
* \<recent\_at>: Unix timestamp of the most recent signal occurrence (from problem\_recent\_at)

#### Source Information

Enclosed in \<source> tags:

* \<id>: Unique identifier for the source (from source\_id)
* \<name>: Name of the source (from source\_name)
* \<description>: Description of the source (from source\_description)
* \<domain>: Domain URL of the source (from source\_url)
* \<url\_logo>: URL for the source's logo image (from source\_url\_logo)
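
To show how the nested structure of a signal might be flattened, here is a small sketch using the Python standard library. The function names and the choice of output keys are hypothetical, and loading the whole document with `ET.fromstring` is only appropriate for files that fit comfortably in memory.

```python
import xml.etree.ElementTree as ET


def parse_signal(signal):
    """Flatten one <signal> element into a plain dict."""
    return {
        "id": signal.findtext("id"),
        "category": signal.findtext("category"),
        "type": signal.findtext("type"),
        "severity": signal.findtext("severity"),
        "evidences": [e.text for e in signal.findall("evidences/evidence")],
        "jobs": [
            {
                "id": j.findtext("id"),
                "title": j.findtext("title"),
                "url": j.findtext("url"),
            }
            for j in signal.findall("jobs/job")
        ],
        "source_id": signal.findtext("source/id"),
        "source_domain": signal.findtext("source/domain"),
    }


def parse_signals(xml_text):
    """Return a list of dicts, one per <signal> under the <signals> root."""
    root = ET.fromstring(xml_text)
    return [parse_signal(s) for s in root.findall("signal")]
```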

***

## **Retrieving XML Files**

XML feeds are stored in an S3 bucket. To retrieve the S3 bucket URL:

* Aggregate feed URLs will be shared with you directly, as they are typically unique per customer
* Per-source feed URLs can be retrieved via the app (when logged in)

**Authentication**

Most XML feeds require authentication. We enforce authentication via AWS signed URLs for access to the S3 buckets.

* JobFront will provide you with an API token
* Submit your API token to our API to retrieve a signed download URL for your S3 bucket

```
Authorization: Bearer <api_token>
curl https://api.jobfront.io<endpoint> -H "Authorization: Bearer <api_token>"

where <endpoint> is:

/v2/xml/jobs - jobs
/v2/xml/signals - signals
```

More explicitly, for both the Jobs and Signals XML feeds, first authorize via the API calls below to retrieve the available signed URLs:

```
curl https://api.jobfront.io/v2/xml/jobs -H "Authorization: Bearer <api_token>"
curl https://api.jobfront.io/v2/xml/signals -H "Authorization: Bearer <api_token>"
```

The API returns signed URLs for the different XML files. Note that each field returns a list of URLs, because for some accounts these files are too large (tens of GB) to manage as a single file, so we split the datasets into multiple files.

```
{
    "signed_urls_all":["https://aws.amazon.com/..."],
    "signed_urls_added":["https://aws.amazon.com/..."],
    "signed_urls_removed":["https://aws.amazon.com/..."]
}
```
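
The same flow could be scripted in Python roughly as follows, using only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the local file naming (`<prefix>_0.xml`, `<prefix>_1.xml`, ...) is made up for the example, and the response is assumed to be the JSON object shown above.

```python
import json
import shutil
import urllib.request

API_BASE = "https://api.jobfront.io/v2"


def build_request(feed, api_token):
    """Build the authenticated request for the 'jobs' or 'signals' feed."""
    if feed not in ("jobs", "signals"):
        raise ValueError("feed must be 'jobs' or 'signals'")
    return urllib.request.Request(
        f"{API_BASE}/xml/{feed}",
        headers={"Authorization": f"Bearer {api_token}"},
    )


def fetch_signed_urls(feed, api_token):
    """Call the API and return the decoded JSON mapping of signed URL lists."""
    with urllib.request.urlopen(build_request(feed, api_token)) as response:
        return json.load(response)


def download_all(urls, prefix):
    """Download each signed URL to <prefix>_<n>.xml and return the local paths."""
    paths = []
    for i, url in enumerate(urls):
        path = f"{prefix}_{i}.xml"  # hypothetical naming scheme
        with urllib.request.urlopen(url) as response, open(path, "wb") as out:
            shutil.copyfileobj(response, out)  # stream to disk in chunks
        paths.append(path)
    return paths
```

A typical run would call `fetch_signed_urls("jobs", token)` and then pass the `signed_urls_all` list (or the added/removed lists) to `download_all`. Streaming each response to disk avoids holding multi-gigabyte files in memory.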

Please reach out to us at <hello@jobfront.io> and we'll get you set up.


