Chapter 5: Data Agent and Smart Properties - AI automatically generates data

Section 5-1

What is Data Agent?—Autonomous data investigation and supplementation using AI

Data Agent is an AI agent feature that HubSpot announced at INBOUND 2025. Given an instruction such as "Look up this contact's industry and put it in the property," the AI autonomously investigates public information on the web, existing data in HubSpot, and related records, infers the most suitable value, and writes it into the property.

Traditionally, there were three ways to "fill in properties": (1) manual entry by the user, (2) collection via a form, and (3) calling an external enrichment service such as Clearbit via API. Data Agent differs from all of these. It is a fourth method: the AI infers the value and fills it in automatically.

🤖 Data Agent operation flow

🔔 Workflow trigger: contact created / updated / scheduled
→ 🤖 Data Agent: the AI investigates multiple sources autonomously and infers a value
   🌐 Web research: official website, LinkedIn, industry databases
   🟡 Reference within the CRM: associated companies, deals, engagement history
→ ✍️ Write to Smart Property: value + confidence score + record of the investigation basis

Each run consumes credits (included allowance: Professional 5,000/month, Enterprise 10,000/month).

If the confidence score is below the configured threshold, the write can be skipped.

There is an option to keep the investigation basis (source URLs, etc.) as an internal note.

Three problems that Data Agent solves

Problem	Conventional approach	Data Agent solution
Many blanks in important fields	Interns or SDRs research and enter values manually: 3-5 minutes per record, 50-80 hours per 1,000 records	A single workflow investigates and fills thousands of records automatically, with no manual research time
Cost of external enrichment	Clearbit/ZoomInfo start at $5,000/year, and update frequency is limited even when the data goes stale	Runs on HubSpot credits and fills only the required fields at the moment they are needed
Non-standardized data entry	An industry field where "IT", "information technology", and "technology" are mixed	The AI selects a value from HubSpot's standard options, so spelling variants do not occur
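The manual-effort figure in the first row follows from simple arithmetic. A minimal sketch, using only the 3-5 minutes per record quoted in the table (the function name is illustrative):

```python
def manual_hours(records: int, minutes_per_record: float) -> float:
    """Hours of manual research needed for `records` records."""
    return records * minutes_per_record / 60


low = manual_hours(1_000, 3)   # 3 minutes per record
high = manual_hours(1_000, 5)  # 5 minutes per record
print(f"{low:.0f} to {high:.0f} hours per 1,000 records")
```

This gives roughly 50 to 83 hours, in line with the 50-80 hour estimate in the table.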

💡 How to combine Data Agent with external enrichment such as Clearbit

External enrichment services such as Clearbit are database-driven, so speed and coverage are their strengths. Data Agent uses AI to investigate the latest information on the web in real time, so its strengths are finding information that is not in any database and keeping up with recent changes. Many RevOps teams combine the two: "Clearbit for basic information on large companies, and Data Agent for small and medium-sized enterprises, startups, and detailed information."

Section 5-2

Design and configure Smart Properties—which fields to leave to AI?

Smart Properties are "dedicated properties to which the Data Agent writes values." Unlike normal properties, each one records a set of 4 items: the value, a confidence score, the last investigation date, and the investigation basis. Create one by selecting the "Smart Property" type on the HubSpot property settings screen.

Fields to entrust to AI and fields not to entrust to AI

✓ Fields that should be left to AI

Industry: The AI infers it from the company website and LinkedIn, then selects the best match from the standard options.

Company size (Employee Count): Estimated from LinkedIn and from the footer and press releases on the official website. Can also be used to refresh existing data that has gone stale.

Tech Stack: Estimates the technologies in use from recruitment information, GitHub, and job postings. Useful for identifying competing or complementary tools, such as whether Salesforce is used.

Head office location (country/city): Obtains the country and city from the "Company Profile" and "Contact Us" pages of the official website.

Funding Stage: Gets the latest funding information from Crunchbase, TechCrunch, etc.

✗ Fields that should not be left to AI

Email address: Registering a value guessed by AI leads to misdelivery and spam reports. Obtain it only from forms or verified sources.

Mobile number: Private information. Values inferred by AI risk violating the Personal Information Protection Act and GDPR.

Financial figures (sales/profit): Financial information of unlisted companies is not public, so estimation accuracy is low. There is a high risk that wrong numbers will distort sales decisions.

Contract details/amount: Accurate data from internal systems is required. AI cannot guess these.

Legal information (registration information, etc.): Legally valid data must be obtained directly from official sources.
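The two lists above can be captured as a simple policy check that runs before anyone configures a new Smart Property. This is a minimal sketch: the internal field names and the "review" fallback are assumptions made for illustration, not HubSpot settings.

```python
# Hypothetical internal names; the grouping follows the two lists above.
AI_ALLOWED = {"industry", "employee_count", "tech_stack", "hq_location", "funding_stage"}
AI_FORBIDDEN = {"email", "mobile_phone", "financial_figures", "contract_amount", "legal_information"}


def smart_property_policy(field: str) -> str:
    """Classify a field as 'allowed', 'forbidden', or 'review' (not yet decided)."""
    if field in AI_FORBIDDEN:
        return "forbidden"
    if field in AI_ALLOWED:
        return "allowed"
    return "review"  # anything unlisted needs a human decision first


for name in ("industry", "email", "job_title"):
    print(name, "->", smart_property_policy(name))
```

Defaulting unlisted fields to "review" rather than "allowed" keeps a new field from being handed to the AI by accident.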

Smart Property sample—actual display image

🏭 Industry (Smart Property) [AI input]
Current value: Software & Technology
Confidence score: 92 / 100
Last investigation date: March 7, 2026
📎 Investigation basis (AI memo): The official website (example.com) describes the product as "Cloud-based SaaS" in its header. The company's LinkedIn profile is classified in the "Software Development" category. The recruitment page lists many engineering positions.

👥 Number of employees (Smart Property) [AI input]
Current value: 201-500 people
Confidence score: 78 / 100
Last investigation date: March 7, 2026
📎 Investigation basis (AI memo): LinkedIn's "About" page displays "201-500 employees." The official press release (December 2025) mentions "over 300 employees." However, since the latest information has not been confirmed, the confidence is medium.

💻 Technology stack (Smart Property) [AI input]
Current value: Salesforce, AWS, React, Python
Confidence score: 85 / 100
Last investigation date: March 6, 2026
📎 Investigation basis (AI memo): AWS, React, and Python are listed in the "Technology used" column on the recruitment page. Wappalyzer-equivalent technology detection confirms the use of Salesforce. These match the engineers' LinkedIn profiles.

💰 Funding stage (Smart Property) [Not investigated]
Current value: (not entered)
Confidence score: -
Investigation status: Skipped because confidence is below the threshold (80)
Button: 🤖 Investigate now with Data Agent

Confidence threshold settings for Smart Property

For each Smart Property, you can set a Confidence Threshold that specifies "write only when the confidence score is at or above this value." This enables quality control such as "enter only reliable information into the CRM, and skip uncertain information and leave the field blank."

Threshold setting	Behavior	Recommended fields
High threshold (80-90+)	Writes only highly reliable information. The skip rate is higher, but written values are more accurate	Fields used in workflow trigger conditions or segment conditions (industry, company size)
Medium threshold (60-79)	Writes values with a reasonable degree of certainty, on the assumption that the person in charge confirms them later	Fields displayed as reference information; fields where humans check and correct AI suggestions
Low threshold (less than 60)	Writes aggressively, prioritizing coverage over accuracy	Generally not recommended. Incorrect data can cause automation to misfire
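The write-or-skip rule behind the threshold can be sketched in a few lines. This is an illustrative model only: the `AgentResult` class and `value_to_write` function are hypothetical names, and the sample scores are taken from the display examples earlier in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentResult:
    """Hypothetical container for what the agent returns for one property."""
    value: str
    confidence: int  # 0-100
    basis: str       # short note on the sources consulted


def value_to_write(result: AgentResult, threshold: int) -> Optional[str]:
    """Return the value if confidence meets the threshold, else None (skip, leave blank)."""
    if result.confidence >= threshold:
        return result.value
    return None


industry = AgentResult("Software & Technology", 92, "official website, LinkedIn")
employees = AgentResult("201-500", 78, "LinkedIn About page, press release")

print(value_to_write(industry, 80))   # written
print(value_to_write(employees, 80))  # skipped at a high threshold
print(value_to_write(employees, 75))  # written at a medium threshold
```

The same result is written or skipped depending only on the threshold chosen for that field, which is why fields that drive automation get the stricter setting.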

Section 5-3

Incorporating it into your workflow—how to use the “Fill Smart Property” action

What actually runs the Data Agent is the workflow action "Fill Smart Property." Simply add this action to your workflow, and the Data Agent automatically investigates and fills in the records that meet the trigger conditions.

⚙️ Workflow design: Auto-complete Smart Properties when creating contacts

Data Agent automatically investigates the industry, company size, and technology stack and writes them to the CRM.

🔔 Trigger: a contact is created (whether via form, API, or import)

⏱️ Step 1: Wait 1 minute
Running slightly after the record is created lets the agent investigate with all form input values already saved.

🤖 Step 2 (Fill Smart Property action): Fill Industry with Data Agent
Smart Property: industry_smart / Confidence threshold: 80 / Skip condition: skip if industry (manually entered) is not blank
Target property: industry_smart / Threshold: 80 / 100 / If there is an existing value: Never Overwrite

🤖 Step 3 (Fill Smart Property action): Fill Employee Count with Data Agent
Smart Property: employee_count_smart / Confidence threshold: 75 / If there is an existing value, it can be overwritten after confirmation

🤖 Step 4 (Fill Smart Property action): Fill the tech stack with Data Agent
Smart Property: tech_stack_smart / Confidence threshold: 70 / Stored in a text-type property, separated by commas

📊 Step 5 (Conditional branching): Calculate the ICP score based on the filled values
Branch condition: industry_smart is "Software & Technology" or "Finance" AND employee_count_smart is "51-200" or larger → set the ICP score to "High".

if industry_smart IN ["Software & Technology","Finance"] AND employee_count_smart >= "51-200" → Contact property icp_tier = "High" → Create notification task to contact owner
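The branch condition compares employee ranges, which are labels rather than numbers, so "51-200 or larger" needs an explicit ordering. The sketch below shows one way to express the rule outside HubSpot (for testing the logic, or in custom code). The list of size ranges is an assumption for illustration; only "51-200" and "201-500" appear in this chapter.

```python
from typing import Optional

# Assumed ordering of range labels, smallest first (hypothetical option list).
SIZE_ORDER = ["1-10", "11-50", "51-200", "201-500", "501-1000", "1001+"]
TARGET_INDUSTRIES = {"Software & Technology", "Finance"}


def icp_tier(industry: Optional[str], employee_range: Optional[str],
             min_range: str = "51-200") -> Optional[str]:
    """Return "High" when both conditions hold, otherwise None (no tier set)."""
    if industry is None or employee_range not in SIZE_ORDER:
        return None  # blank or unknown values never qualify
    big_enough = SIZE_ORDER.index(employee_range) >= SIZE_ORDER.index(min_range)
    if industry in TARGET_INDUSTRIES and big_enough:
        return "High"
    return None


print(icp_tier("Software & Technology", "201-500"))  # qualifies
print(icp_tier("Finance", "11-50"))                  # too small
print("201-500" >= "51-200")                         # naive string comparison gets it wrong
```

Comparing positions in an ordered list avoids the trap shown on the last line: as plain strings, "201-500" sorts before "51-200", so a direct text comparison would exclude larger companies.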

How to fill a large number of existing blank records at once

The workflow above uses an "on new creation" trigger, so it does not apply to the thousands of blank records that already exist. There are two ways to fill existing records in bulk.

Method 1: Workflow "Apply to past records": From "Register existing contacts" in the workflow settings, specify the target past records (e.g. "Industry is blank and the email is a B2B address") and enroll them in bulk. Calculate the credits that will be consumed before execution (1 to 3 credits per item).

Method 2: Batch fill from the "Missing Data" tab in Command Center: The "Fill with Data Agent" button may be displayed on the "Missing Data" tab of the Data Quality Command Center introduced in Chapter 2. Check the target property and the number of records before executing.
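Before enrolling records with either method, it helps to count how many actually match the target condition. The sketch below applies the example condition from Method 1 ("Industry is blank and the email is a B2B address") to an exported list of contacts. The free-mail domain list and the dictionary keys are assumptions for illustration; a real list would be longer.

```python
# Illustrative list of consumer mail domains treated as "not B2B" (assumption).
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


def is_backfill_target(contact: dict) -> bool:
    """True when industry is blank and the email belongs to a non-free-mail domain."""
    industry = (contact.get("industry") or "").strip()
    email = (contact.get("email") or "").strip().lower()
    if industry or "@" not in email:
        return False
    domain = email.rsplit("@", 1)[1]
    return domain not in FREE_MAIL_DOMAINS


contacts = [
    {"email": "ana@example.com", "industry": ""},
    {"email": "bob@gmail.com", "industry": None},
    {"email": "chris@example.org", "industry": "Finance"},
    {"email": "", "industry": ""},
]
targets = [c for c in contacts if is_backfill_target(c)]
print(len(targets), "of", len(contacts), "contacts would be enrolled")
```

The resulting count is the number to feed into the credit estimate before running the bulk fill.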

⚠️ Be sure to estimate credit consumption before running a bulk fill.

Filling 3 Smart Properties for 10,000 contacts means 30,000 runs, which consumes about 30,000 credits even at roughly 1 credit per run. That far exceeds the Professional plan's monthly allowance (5,000 credits). The safe way to proceed is: first run a test on 100 records → check accuracy and credit consumption → expand gradually.
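The estimate and the gradual rollout can both be worked out ahead of time. A minimal sketch, assuming 1 credit per run and a tenfold growth per stage (the growth factor and function names are illustrative choices, not HubSpot settings):

```python
def estimate_credits(records: int, properties: int, credits_per_run: int) -> int:
    """Total credits for filling `properties` Smart Properties on `records` records."""
    return records * properties * credits_per_run


def staged_plan(total_records: int, pilot: int = 100, factor: int = 10) -> list:
    """Batch sizes for a gradual rollout: a pilot first, then batches growing by `factor`."""
    plan, done, batch = [], 0, pilot
    while done < total_records:
        size = min(batch, total_records - done)
        plan.append(size)
        done += size
        batch *= factor
    return plan


total = estimate_credits(10_000, 3, 1)
print("Estimated credits:", total)
print("Professional monthly allowances needed:", total / 5_000)
for size in staged_plan(10_000):
    print(f"batch of {size:>5} records -> {estimate_credits(size, 3, 1):>6} credits")
```

With these inputs the plan is a 100-record pilot (300 credits), then 1,000 records, then the remaining 8,900. The pilot alone shows the real per-run cost before most of the budget is committed.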

Section 5-4

Credit management, accuracy, and governance—Risks and countermeasures for using AI data

Credit structure and management

Data Agent is usage-based: each run consumes HubSpot "AI Credits." The same credits are used not only by Smart Properties but also by AI assistants and future AI features, so they need to be managed properly.

💳 AI Credit Management Guide

Plan	Credits
Professional	5,000 credits/month (included)
Enterprise	10,000 credits/month (included)
Additional purchase	+α (can be added in units of 1,000 credits)

Main consumption actions and estimated costs

Action	Estimated cost	Note
Smart Property fill (no web research / CRM information only)	Approximately 1 credit per run	Simple inference
Smart Property fill (with web research / standard)	Approximately 2-3 credits per run	Most common
Smart Property fill (deep web research / multiple sources)	Approximately 5-8 credits per run	Technology stack, etc.
AI formula suggestions in Data Studio (Chapter 4)	Approximately 1-2 credits per run	Formula generation only

Where to check the remaining balance: Settings → Account → Usage → AI Credits (can be checked in real time)
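Because the cost per run depends on research depth, a monthly budget is better expressed as a range than as a single number. The sketch below uses the approximate per-run figures and plan allowances listed above; the mode names and the sample run counts are made up for illustration.

```python
# Approximate (low, high) credits per run, taken from the estimates above.
CREDIT_RANGES = {
    "crm_only": (1, 1),
    "web_standard": (2, 3),
    "web_deep": (5, 8),
}
PLAN_ALLOWANCE = {"Professional": 5_000, "Enterprise": 10_000}


def monthly_estimate(runs_by_mode: dict) -> tuple:
    """Return (low, high) total credits for a month of planned runs."""
    low = sum(CREDIT_RANGES[mode][0] * runs for mode, runs in runs_by_mode.items())
    high = sum(CREDIT_RANGES[mode][1] * runs for mode, runs in runs_by_mode.items())
    return low, high


def fits_plan(runs_by_mode: dict, plan: str) -> bool:
    """Conservative check: the high estimate must fit in the included allowance."""
    return monthly_estimate(runs_by_mode)[1] <= PLAN_ALLOWANCE[plan]


planned = {"crm_only": 500, "web_standard": 1_000, "web_deep": 200}
print(monthly_estimate(planned))
print("Fits Professional:", fits_plan(planned, "Professional"))
print("Fits Enterprise:", fits_plan(planned, "Enterprise"))
```

Checking against the high end of the range is a deliberate choice: the same credits are shared with other AI features, so leaving headroom is safer than planning to the exact limit.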

AI data accuracy—understand the expected value of each field

The accuracy of AI-generated data varies by field. It is important to design with the understanding that "high accuracy = can be used as is" and "low accuracy = reference level."

Field	Estimated accuracy	Notes
🏭 Industry	88-92%	In many cases it can be determined clearly from the description on the official website or LinkedIn. Accuracy is especially high for listed and major companies, and lower for sole proprietorships and freelancers.
👥 Number of employees (range)	80-88%	High accuracy when official information on LinkedIn is available. However, the answer is a range such as 201-500 people, and exact figures are hard to obtain for unlisted companies.
🌍 Head office location (country/city)	90-95%	Often stated in the footer or Contact Us page of the official website, so this field has the highest accuracy. The "global headquarters" of a multinational company may be wrong.
💻 Technology stack	72-82%	Estimated from recruitment information and public information on GitHub. Startups and development companies expose a lot of information. Accuracy drops for non-IT companies and small and medium-sized enterprises with little public information.
💰 Funding stage	65-78%	Highly accurate for startups covered by sites such as TechCrunch and Crunchbase. Accuracy drops for companies with no public information or when only old funding news is available.
📊 Annual sales (estimated)	40-60%	Financial information of unlisted companies is not public, so estimation accuracy drops sharply. Coverage in outlets such as the Wall Street Journal improves it, but treat this field as reference level only.

AI data governance design

① Be sure to record what the AI wrote: An "AI input flag" is set automatically for Smart Properties, but no flag is set when custom code writes to a property. Design your properties so that AI-generated data and human-entered data can be clearly distinguished.

② Use only high-precision fields as branch conditions in the workflow: Fields whose accuracy is at reference level (for example, estimated annual sales) should be displayed for reference only and not used to drive automation.

③ Conduct regular accuracy audits: Build accuracy audits into a monthly cadence, alongside credit consumption reports.
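For point ①, one way to keep provenance when custom code writes a value is to write companion properties alongside it. The sketch below only builds the dictionary of property updates and does not call any HubSpot API. The `_source` and `_updated_on` companion property names are hypothetical and would have to be created in the portal first.

```python
from datetime import date
from typing import Optional


def build_update(property_name: str, value: str, source: str,
                 today: Optional[date] = None) -> dict:
    """Return property updates that carry the value plus who wrote it and when."""
    if source not in {"ai", "human"}:
        raise ValueError("source must be 'ai' or 'human'")
    today = today or date.today()
    return {
        property_name: value,
        f"{property_name}_source": source,                # hypothetical companion property
        f"{property_name}_updated_on": today.isoformat(),  # hypothetical companion property
    }


print(build_update("industry_custom", "Finance", "ai", date(2026, 3, 7)))
```

Writing the source in the same update as the value means a record can never end up with an AI-written value that looks human-entered.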

✅ Recommended steps for Data Agent deployment

① First, run a small pilot of 100 records targeting only two fields: Industry and Headquarters Location
② Observe accuracy, credit consumption, and business impact for two weeks
③ If there are no problems, add target fields and expand the number of records
④ Build monthly accuracy audits and credit consumption reports into your regular cadence

These 4 steps let you introduce Data Agent into production safely.
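The accuracy check in steps ② and ④ can be as simple as comparing a sample of AI-written values against values a person has verified. A minimal sketch, assuming the sample is available as (field, AI value, verified value) tuples. The sample data below is invented for illustration.

```python
def audit_accuracy(samples: list) -> dict:
    """Return {field: share of AI values that match the human-verified value}."""
    totals, hits = {}, {}
    for field, ai_value, verified_value in samples:
        totals[field] = totals.get(field, 0) + 1
        # Compare case-insensitively and ignore surrounding whitespace.
        if ai_value.strip().lower() == verified_value.strip().lower():
            hits[field] = hits.get(field, 0) + 1
    return {field: hits.get(field, 0) / totals[field] for field in totals}


samples = [
    ("industry", "Software & Technology", "Software & Technology"),
    ("industry", "Finance", "finance"),
    ("industry", "Retail", "Manufacturing"),
    ("industry", "Finance", "Finance"),
    ("hq_location", "Tokyo, Japan", "Tokyo, Japan"),
    ("hq_location", "Berlin, Germany", "Berlin, Germany"),
]
for field, accuracy in audit_accuracy(samples).items():
    print(f"{field}: {accuracy:.0%}")
```

Tracking this figure per field each month shows whether a field still deserves to be used in automation or should be downgraded to reference use.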

📌 Chapter 5 Summary

Data Agent is the third option for “manual research/external enrichment”

Following manual input, form collection, and external enrichment APIs, "AI-based autonomous web research, inference, and automatic input" has been added as a new method. Clearbit and similar services are database-driven and fast, while Data Agent investigates the web in real time and is strong on the latest information. The practical answer is to use both in combination.

Smart Property is recorded as a 3-point set of "value, reliability, and basis"

The biggest difference from regular properties is that the confidence score and the investigation basis are saved together with the value. You can control quality by setting a threshold (for example, writing only at 80 or above). The basic rule is to use a high threshold for fields used in workflow automation conditions, and a medium threshold for fields used as reference information.

Clearly separate fields to be entrusted to AI and fields not to be entrusted to AI

Industry, company size, location, technology stack, and funding stage are suitable for AI. Never leave email addresses, phone numbers, financial figures, or contract information to AI. Decide by asking what would happen if an incorrect value were used in automation. For personal information, consider the risks under GDPR and the Personal Information Protection Act.

Calculate credit consumption first and then apply on a large scale

The Professional plan's 5,000 credits per month are exceeded almost immediately if you fill 3 properties for 10,000 records. Be sure to check the per-record credit cost with a 100-record pilot and estimate monthly consumption before applying it to production. Additional credits can be purchased, but take care not to exceed your budget.

Next Chapter

Chapter 6: Data Warehouse Integration (Enterprise)—Bidirectional integration with Snowflake/BigQuery →
