Python Script to Read and Display Word Document Content

Chapter 1: Introduction to Document Reading

In this section, we will explore a Python script designed to read and present the number of paragraphs and their respective content from a specified Word document. This functionality is particularly beneficial for tasks such as document statistics and text analysis.

The following code snippet uses the docx module (installed as the python-docx package) to open a Word document. It creates a Document object from the file, which loads the document into memory. The script computes the total number of paragraphs with len(doc.paragraphs) and prints it. Note that doc.paragraphs includes empty paragraphs and does not include text inside tables. Each paragraph is then visited with the enumerate function, which yields a counter (starting at 1 here) together with the paragraph object. The results are printed using f-strings.

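import docx

# Load the Word document into memory
doc = docx.Document("script.docx")

# Report the total number of paragraphs
print(f"Number of paragraphs: {len(doc.paragraphs)}")

# Print each paragraph with a 1-based counter
for count, para in enumerate(doc.paragraphs, start=1):
    print(f"{count}: {para.text}")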
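Since the introduction mentions document statistics, here is a minimal sketch of one such statistic: the number of non-empty paragraphs and a whitespace-based word count. The helper names paragraph_stats and stats_for_file are illustrative assumptions and not part of python-docx; only docx.Document, doc.paragraphs, and para.text come from the library.

```python
def paragraph_stats(texts):
    """Return (non_empty_paragraphs, total_words) for a list of paragraph strings."""
    # Ignore paragraphs that are empty or contain only whitespace
    non_empty = [t for t in texts if t.strip()]
    # Count words by splitting on whitespace (a rough approximation)
    total_words = sum(len(t.split()) for t in non_empty)
    return len(non_empty), total_words


def stats_for_file(path):
    """Load a .docx file and compute the statistics above (hypothetical helper)."""
    import docx  # imported here so paragraph_stats works without python-docx

    doc = docx.Document(path)
    return paragraph_stats([para.text for para in doc.paragraphs])
```

For example, paragraph_stats(["Hello world", "", "One two three"]) returns (2, 5), and stats_for_file("script.docx") would apply the same counting to the document used above.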

Chapter 2: Retrieving and Modifying Paragraph Content

The next code snippet opens a Word document (.docx) with the python-docx package, reports how many runs selected paragraphs contain, and prints the text of each run. A run is a stretch of text within a paragraph that shares the same character formatting. Finally, the document is saved under a new file name.

The code starts by opening the document at the defined path using docx.Document(doc_path). It then counts the paragraphs with len(doc.paragraphs) and prints the result.

Next, it loops over selected paragraph indices (here 0 and 2, the first and third paragraphs) and retrieves each paragraph object with doc.paragraphs[p_idx]. Both indices must exist: a document with fewer than three paragraphs raises an IndexError. The script counts the runs in the paragraph with len(para.runs) and prints the result.

An inner loop then iterates through each run in the paragraph and prints its text via run.text. Finally, doc.save("Modified_Document.docx") writes the document to a new file. Because the script only reads content and changes nothing along the way, that file is a copy of the original.

import docx

doc_path = "script.docx"
doc = docx.Document(doc_path)
print("Number of paragraphs:", len(doc.paragraphs))

# Inspect the first and third paragraphs (both indices must exist)
for p_idx in [0, 2]:
    para = doc.paragraphs[p_idx]
    print("Number of runs:", len(para.runs))
    # Print the text of each run in this paragraph
    for run in para.runs:
        print(run.text)

# Specify the path for saving; nothing above changes the content
doc.save("Modified_Document.docx")
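The script above saves the document without changing it. As a hedged sketch of what an actual modification could look like, the helper below rewrites text run by run, which leaves each run's formatting in place. The names replace_in_runs and replace_in_file are hypothetical, as are any search and replacement strings passed to them. The approach only finds matches that sit entirely inside a single run; text that Word has split across several runs is not matched.

```python
def replace_in_runs(paragraph, old, new):
    """Replace `old` with `new` in every run of a paragraph.

    Returns the number of runs whose text changed.
    """
    if not old:
        raise ValueError("old must be a non-empty string")
    changed = 0
    for run in paragraph.runs:
        if old in run.text:
            # Assigning to run.text replaces the run's content in place
            run.text = run.text.replace(old, new)
            changed += 1
    return changed


def replace_in_file(src_path, dst_path, old, new):
    """Apply replace_in_runs to each body paragraph, then save to dst_path."""
    import docx  # imported here so replace_in_runs works without python-docx

    doc = docx.Document(src_path)
    total = sum(replace_in_runs(para, old, new) for para in doc.paragraphs)
    doc.save(dst_path)
    return total
```

A call such as replace_in_file("script.docx", "Modified_Document.docx", "draft", "final") would then produce a saved file that differs from the original wherever a run contained the word "draft".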

Chapter 3: Extracting Heading Paragraphs

The following code reads a Word document and prints only its heading paragraphs, identified as those whose style name starts with "Heading" (for example "Heading 1" or "Heading 2"). It uses the same python-docx package as the earlier snippets.

Importing the `docx` Module: This line allows access to the functionalities provided by the python-docx library.
Opening the Word Document: A doc object is created by invoking the Document class to open the specified Word document ("script.docx"). This enables reading and processing the document's content.
Counting Paragraphs: The total number of paragraphs in the document is calculated using the len function and printed.
Iterating Through Paragraphs: A loop iterates through each paragraph in the document, checking if the style name begins with "Heading". If true, the text of that paragraph is printed.

In summary, this code reads the total number of paragraphs in a specified Word document and prints all heading paragraphs.

import docx

doc = docx.Document("script.docx")
print("Paragraphs:", len(doc.paragraphs))

# Print only paragraphs whose style name starts with "Heading"
for para in doc.paragraphs:
    if para.style.name.startswith("Heading"):
        print(para.text)
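Building on the heading filter, the sketch below turns the headings into an indented outline. It assumes the document uses Word's built-in heading styles, which python-docx exposes under the names "Heading 1" through "Heading 9". It uses a stricter regular-expression match instead of the startswith check above, so a custom style that merely begins with "Heading" is skipped. The names heading_level, format_outline, and outline_for_file are illustrative and not part of the library.

```python
import re

# Matches built-in heading style names such as "Heading 1" or "Heading 3"
_HEADING_RE = re.compile(r"^Heading (\d+)$")


def heading_level(style_name):
    """Return the numeric level of a heading style name, or None."""
    match = _HEADING_RE.match(style_name)
    return int(match.group(1)) if match else None


def format_outline(entries, indent="  "):
    """Turn (style_name, text) pairs into indented outline lines."""
    lines = []
    for style_name, text in entries:
        level = heading_level(style_name)
        if level is not None:
            # Level 1 is flush left; each deeper level adds one indent
            lines.append(indent * (level - 1) + text)
    return lines


def outline_for_file(path):
    """Build the outline for a .docx file (hypothetical helper)."""
    import docx  # imported here so the helpers above work without python-docx

    doc = docx.Document(path)
    return format_outline((p.style.name, p.text) for p in doc.paragraphs)
```

Printing "\n".join(outline_for_file("script.docx")) would show the document's headings nested by level, which can serve as a quick table of contents.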
