Academic Research · Intermediate

Idea Catalyst

Find cross-domain research inspiration by decomposing problems, searching other fields, and synthesizing transferable insights.

15 minutes
By pkargupta
#research #cross-domain #inspiration #interdisciplinary #literature-search #idea-generation
CLAUDE.md Template

Download this file and place it in your project folder to get started.

# Idea Catalyst — Cross-Domain Research Inspiration

Find transferable insights from other scientific domains to spark novel approaches for your research problem. Based on the [Idea Catalyst](https://github.com/pkargupta/idea_catalyst) framework (Kargupta et al., 2025).

## When to Use

- You have a research problem and want inspiration from outside your field
- You're stuck on a challenge and want to explore cross-domain analogies
- You want to systematically discover how concepts from other fields could apply to your work
- You're writing a paper and need to identify novel cross-disciplinary connections

## How It Works

The pipeline follows four stages:

### 1. Problem Decomposition
Break down a research problem statement into targeted questions that capture different facets of the challenge — technical, conceptual, and methodological.

### 2. Target-Domain Literature Search
Search Semantic Scholar for papers in your own domain to understand the current landscape, gaps, and established approaches.
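
As a rough illustration of what one such query could look like, the sketch below builds (but does not send) a request to the Semantic Scholar Graph API paper-search endpoint. The helper name `build_search_request` and the choice of `fields` are assumptions made for this example, not the repository's own code; `limit` plays the role that `--max_papers_per_query` plays in the pipeline.

```python
from urllib.parse import urlencode
from urllib.request import Request

S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_request(query, api_key, limit=20, fields=("title", "abstract", "year")):
    """Build a paper-search request without sending it (hypothetical helper)."""
    params = urlencode({"query": query, "limit": limit, "fields": ",".join(fields)})
    # The API key travels in the x-api-key header.
    return Request(f"{S2_SEARCH_URL}?{params}", headers={"x-api-key": api_key})


req = build_search_request(
    "long-context attention efficiency",
    api_key="your-semantic-scholar-api-key",
)
print(req.full_url)
```

To actually run the query, pass `req` to `urllib.request.urlopen` and parse the JSON response body.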

### 3. Cross-Domain Literature Search
Search other domains for papers that address analogous challenges. The system reformulates your questions to be domain-agnostic, then searches across fields such as biology, physics, economics, and philosophy.

### 4. Integration & Ranking
Synthesize cross-domain findings into concrete, actionable inspiration ideas ranked by relevance and transferability to your original problem.

## Setup

### Prerequisites

```bash
git clone https://github.com/pkargupta/idea_catalyst.git
cd idea_catalyst
pip install -r requirements.txt
```

### API Keys

Create a `config.py` file with your Semantic Scholar API key:

```python
API_KEY = "your-semantic-scholar-api-key"
```

Get a free key at https://www.semanticscholar.org/product/api
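
If you would rather not keep the key in a tracked file, one option is to have `config.py` read it from the environment. This is only a sketch: the variable name `S2_API_KEY` is an assumption chosen for illustration, and it presumes the pipeline reads `API_KEY` from `config.py` as described above.

```python
import os

# Hypothetical environment variable name; export it in your shell before running.
API_KEY = os.environ.get("S2_API_KEY", "")

if not API_KEY:
    print("Warning: S2_API_KEY is not set; Semantic Scholar requests may be rate-limited or rejected.")
```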

### Running the Pipeline

```bash
python inspiration_pred.py \
  --problem_file data/cross-domain-inspiration-relations.json \
  --model_name Qwen/Qwen3-14B \
  --output_dir inspiration_pred_output \
  --max_papers_per_query 20 \
  --temp 0.7 \
  --min_rel_threshold 0.5
```

## Key Options

| Flag | Description |
|------|-------------|
| `--problem_file` | JSON file with research problems (see data format below) |
| `--model_name` | LLM to use for decomposition and synthesis |
| `--output_dir` | Where to write results |
| `--max_papers_per_query` | Max papers to retrieve per search query |
| `--temp` | Temperature for generation |
| `--min_rel_threshold` | Minimum relevance score to keep a cross-domain paper |
| `--skip_if_exists` | Skip problems that already have output files |
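
To make the effect of `--min_rel_threshold` concrete, here is a small sketch of threshold filtering. It assumes each retrieved paper carries a numeric `relevance` score; that field name, the toy data, and the helper `filter_by_relevance` are illustrative assumptions, since how the pipeline computes relevance is not described here.

```python
def filter_by_relevance(papers, min_rel_threshold=0.5):
    """Keep papers whose score meets the threshold, highest score first."""
    kept = [p for p in papers if p["relevance"] >= min_rel_threshold]
    return sorted(kept, key=lambda p: p["relevance"], reverse=True)


# Toy data for illustration only.
papers = [
    {"title": "Paper A", "relevance": 0.82},
    {"title": "Paper B", "relevance": 0.47},
    {"title": "Paper C", "relevance": 0.55},
]

for threshold in (0.5, 0.8):
    titles = [p["title"] for p in filter_by_relevance(papers, threshold)]
    print(threshold, titles)
# 0.5 ['Paper A', 'Paper C']
# 0.8 ['Paper A']
```

Raising the threshold shrinks the evidence set, so a lower value may be worth trying when a problem returns too few cross-domain papers.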

## Input Format

Your problem file should be a JSON array in which each entry is an object with at least the following fields (one entry shown):

```json
{
  "context": "Your research problem statement here",
  "source_domain": "Your field (e.g., Computer Science)",
  "target_domain": "Domain to search for inspiration (e.g., Biology)",
  "publication_year": 2024
}
```
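
Before launching a long run, it can help to check that a problem file has this shape. The sketch below is one way to do that; `validate_problems` is a hypothetical helper written for this example, not part of the repository, and it checks only for the four fields listed above.

```python
import json

REQUIRED_KEYS = {"context", "source_domain", "target_domain", "publication_year"}


def validate_problems(problems):
    """Return a list of error strings; an empty list means the shape looks right."""
    if not isinstance(problems, list):
        return ["top level must be a JSON array"]
    errors = []
    for i, entry in enumerate(problems):
        if not isinstance(entry, dict):
            errors.append(f"entry {i}: expected an object")
            continue
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            errors.append(f"entry {i}: missing {sorted(missing)}")
    return errors


problems = [{
    "context": "Reduce attention cost for long-context transformers",
    "source_domain": "Computer Science",
    "target_domain": "Biology",
    "publication_year": 2024,
}]

print(validate_problems(problems))  # []

# Write this string to a file and pass its path via --problem_file.
serialized = json.dumps(problems, indent=2)
```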

## Output Format

Each output JSON file contains:
- **Problem metadata**: research problem, domains, ground truth references
- **Cross-domain evidence**: papers grouped by question and domain
- **Idea rankings**: integrated ideas ranked by relevance and transferability

## Using with Claude Code

Instead of running the full pipeline, you can use this template to guide Claude through the same intellectual process manually:

1. **Describe your research problem** — be specific about the challenge
2. **Ask Claude to decompose it** — "Break my research problem into 3-5 targeted questions"
3. **Request cross-domain search** — "What fields outside [your domain] have solved analogous problems?"
4. **Synthesize inspirations** — "How could [cross-domain concept] be adapted to my problem?"

This conversational approach suits cases where you want the structured thinking framework without setting up the automated pipeline.

## Tips

- Start with a clear, specific problem statement — vague problems yield vague inspirations
- Try multiple target domains — the best insights often come from unexpected fields
- The `--min_rel_threshold` flag trades quantity for quality: a higher value keeps fewer but more relevant papers
- Use `--skip_if_exists` for large batches to resume interrupted runs
- The default dataset comes from CHIMERA (cross-domain inspiration relations)
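
The resume behaviour of `--skip_if_exists` can be pictured with the sketch below. It assumes one output file per problem named `<id>.json` inside the output directory; that naming scheme and the helper `pending_problems` are assumptions for illustration, not the pipeline's documented layout.

```python
import tempfile
from pathlib import Path


def pending_problems(problem_ids, output_dir, skip_if_exists=True):
    """Return the problem ids that still need to be processed."""
    if not skip_if_exists:
        return list(problem_ids)
    out = Path(output_dir)
    # Hypothetical naming scheme: one <id>.json file per finished problem.
    return [pid for pid in problem_ids if not (out / f"{pid}.json").exists()]


# Simulate an interrupted run in which only problem 0 finished.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "0.json").write_text("{}")
    print(pending_problems([0, 1, 2], tmp))  # [1, 2]
```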
README.md

What This Does

Systematically finds transferable insights from other scientific domains for your research problem. Based on the Idea Catalyst framework (Kargupta et al., 2025), it decomposes your problem into questions, searches your domain and cross-domain literature via Semantic Scholar, then integrates and ranks inspirations by transferability.


Quick Start

Step 1: Clone the Repository

git clone https://github.com/pkargupta/idea_catalyst.git
cd idea_catalyst
pip install -r requirements.txt

Step 2: Download the Template

Click Download above to get the CLAUDE.md file and place it in the idea_catalyst/ directory.

Step 3: Start Working

claude

Say: "Find cross-domain inspiration for my research problem: [describe your challenge]"


The Four-Stage Pipeline

| Stage | What Happens |
|-------|--------------|
| 1. Decompose | Break research problem into targeted questions (technical, conceptual, methodological) |
| 2. Target Search | Search your domain's literature for current landscape and gaps |
| 3. Cross-Domain Search | Reformulate questions as domain-agnostic, search other fields |
| 4. Integrate & Rank | Synthesize cross-domain findings into ranked inspiration ideas |

Prerequisites

  • Python 3.10+
  • Semantic Scholar API key (free)
  • GPU recommended for local LLM inference (uses vLLM)
  • Dependencies: torch, transformers, vllm, spacy, pandas, scikit-learn

Key Options

| Flag | Description |
|------|-------------|
| `--model_name` | LLM for decomposition and synthesis (default: Qwen3-14B) |
| `--max_papers_per_query` | Papers to retrieve per search query |
| `--min_rel_threshold` | Minimum relevance score to keep (higher = fewer, better results) |
| `--skip_if_exists` | Resume interrupted batch runs |

Using Without the Pipeline

You can use the Idea Catalyst thinking framework directly with Claude — no setup needed:

  1. Describe your problem — be specific about the research challenge
  2. Decompose — "Break this into 3-5 targeted research questions"
  3. Cross-domain search — "What fields outside [your domain] have solved similar problems?"
  4. Synthesize — "How could [concept from another field] adapt to my problem?"

Tips

  • Start with a clear, specific problem statement — vague problems yield vague inspirations
  • Try multiple target domains — the best insights often come from unexpected fields
  • The min_rel_threshold controls quality vs. quantity: higher = fewer but more relevant
  • The dataset is derived from CHIMERA cross-domain inspiration relations

Example Prompts

"Find cross-domain inspiration for improving transformer efficiency in NLP"
"What can drug discovery learn from supply chain optimization?"
"Decompose my research problem into cross-domain searchable questions"
"Search biology and physics for analogies to my distributed systems challenge"
"Rank these cross-domain ideas by transferability to my problem"
