Understanding Challenging Python Interview Questions: Part I
Based on my career experiences, I have compiled three of the most challenging questions I've faced in Python interviews. While they may not represent the hardest questions overall, they certainly feature on many Python developers' lists.
In this article, I will present these questions along with explanations, aiming to equip you with a better understanding for future interviews. This will be the first installment of a two-part series; the next article will cover three additional questions. While this piece won't delve deeply into each topic, I will strive to provide a clear grasp of these concepts.
1. Global Interpreter Lock (GIL)
A few years ago, I was asked about the Global Interpreter Lock, or GIL, and I realized I was not prepared to answer. Although I had heard of the GIL many times, I had never taken the time to study it thoroughly.
To explain the GIL, I must first clarify the concept of a process, and to understand how a process does its work, one must also understand threads.
— Understanding Process
In the context of Python, a process can be defined as an instance of a running Python program. Each process operates as an independent unit with its own memory, environment, CPU time, and resources. Consequently, if you run multiple Python programs simultaneously, they will function independently of one another.
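As a rough illustration of that independence, the sketch below starts a second Python interpreter from the current one and compares process IDs. It assumes `sys.executable` points at a usable interpreter; the variable names are only illustrative.

```python
import os
import subprocess
import sys

# Start a brand-new Python interpreter (a separate process) and
# ask it to print its own process ID.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True,
    text=True,
    check=True,
)

parent_pid = os.getpid()
child_pid = int(child.stdout.strip())

# Two running Python programs, two different processes.
print(f"parent process: {parent_pid}")
print(f"child process:  {child_pid}")
```

The two IDs differ because the operating system treats each interpreter as its own process, with its own memory and resources.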
The idea of a process is crucial, and we will revisit it later with practical examples.
— Understanding Thread
To simplify this explanation, I'll reuse a real-world analogy from a previous discussion: a restaurant and a chef.
Imagine you are a chef working in a restaurant. The kitchen serves as your "office," where all the essential activities occur. The kitchen represents the process — the place where ingredients and cooking tools are stored.
However, to prepare meals, you need someone — the chef. The chef symbolizes the thread; they access the ingredients (memory) and utilize the tools (CPU).
In Python, when a program is executed, it begins with a single thread, known as the main thread. Python allows the creation of additional threads within a single process, enabling them to share the same memory. This is analogous to a restaurant with multiple chefs managing various orders.
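To make the "multiple chefs, one kitchen" idea concrete, here is a minimal sketch in which two threads write into the same list. The names `orders` and `cook` are hypothetical and exist only for this illustration.

```python
import threading

orders = []  # shared memory: every thread in the process sees this list


def cook(chef_name: str, dish: str) -> None:
    # Each "chef" (thread) works in the same "kitchen" (process memory).
    orders.append(f"{chef_name} cooked {dish}")


chefs = [
    threading.Thread(target=cook, args=("Chef A", "soup")),
    threading.Thread(target=cook, args=("Chef B", "pasta")),
]
for chef in chefs:
    chef.start()
for chef in chefs:
    chef.join()

# The program itself started in the main thread.
print(threading.main_thread().name)
print(sorted(orders))
```

Both threads appended to the same `orders` list without any copying, which is exactly what sharing memory within one process means.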
— Now, Let's Understand GIL
The GIL is a mutex in CPython, the reference Python implementation, that guarantees only one thread can execute Python bytecode at any one time, no matter how many threads the process has created.
The GIL's purpose is to simplify memory management and keep the interpreter's internal state, such as object reference counts, thread-safe. As mentioned earlier, threads within the same process share memory, which creates the risk of a race condition.
A race condition arises when multiple threads access shared data at the same time and at least one of them modifies it, so the result depends on the order in which the threads happen to run. If one thread alters the shared data at the wrong moment, another thread may act on a value it did not expect.
salary = 4000

# Thread A - Salary payment
def pay_salary():
    global salary
    # Process the payment using the current value of salary
    print(f"Paying salary: {salary}")

# Thread B - Read and modify salary's value
def update_salary():
    global salary
    salary = 5000
    # Do something else
In this scenario, we have two threads — A and B. Thread A handles the salary payment based on the current variable value, while Thread B can read and change the salary value.
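If both threads run concurrently, and Thread B updates the salary before Thread A processes the payment, the result could be an incorrect salary being paid out. Note that the GIL does not prevent this kind of application-level race. It only protects the interpreter's internals, so shared data like this still needs explicit synchronization, such as a lock.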
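One common way to guard shared data is `threading.Lock`. The sketch below is a simplified stand-in for the salary scenario: several threads update a shared counter, and the lock makes each update happen as one uninterrupted step. The names `balance`, `balance_lock`, and `deposit` are assumptions made for this example.

```python
import threading

balance = 0
balance_lock = threading.Lock()


def deposit(times: int) -> None:
    global balance
    for _ in range(times):
        # Only one thread at a time may run the read-modify-write below.
        with balance_lock:
            balance += 1


threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # 40000: no update is lost while the lock is held
```

The same pattern would apply to the salary example: both the code that reads the salary to pay it and the code that changes it would acquire the same lock first.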
The GIL presents a trade-off in terms of concurrency. While it simplifies memory management, it prevents threads from running Python bytecode in parallel, which limits the performance of CPU-bound multithreaded code. We will discuss this next.
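A rough way to observe this trade-off is to time the same CPU-bound work run sequentially and then in two threads. This is only a sketch: `count_down` and `N` are hypothetical names, and the exact timings depend on your machine and Python build, so no specific numbers are claimed.

```python
import threading
import time


def count_down(n: int) -> None:
    # Pure-Python, CPU-bound loop: the running thread needs the GIL throughout.
    while n > 0:
        n -= 1


N = 2_000_000

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two calls in two threads.
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, two threads: {threaded:.2f}s")
```

On a standard GIL-enabled CPython build, the threaded version typically takes about as long as the sequential one, because the two threads take turns rather than running in parallel.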
2. Multiprocessing vs. Multithreading
This question typically arises in the form: "What distinguishes multiprocessing from multithreading?"
— Multiprocessing
In simple terms, multiprocessing means running multiple processes at the same time. Each process has its own Python interpreter, memory space, and GIL.
Because of the GIL, a single Python process executes Python bytecode on only one CPU core at a time, even if the machine has eight or more cores.
By leveraging multiprocessing, we can optimize our machine's capabilities and enhance performance by utilizing multiple cores to run more processes concurrently. This can be accomplished using the multiprocessing package in Python.
# built-in package
import multiprocessing
Using the multiprocessing module provides true parallelism, enabling the execution of multiple tasks simultaneously. This parallelism can be implemented using the Pool object:
from multiprocessing import Pool
from dataclasses import dataclass


@dataclass
class MessageObject:
    recipient: str
    message: str


def send_message(msg_obj: MessageObject) -> None:
    print(f"sending message to - {msg_obj.recipient}.")
    # do some cool stuff


if __name__ == '__main__':
    # Create a pool of four worker processes and run the tasks in parallel
    with Pool(4) as pool:
        pool.map(
            send_message,
            [
                MessageObject(recipient="Foo", message="Bar"),
                MessageObject(recipient="John", message="Doe")
            ]
        )
In the example above, we create a pool of four worker processes using the Pool object. If you omit the argument, Pool sizes itself to the number of CPUs the system reports. We have two tasks (sending two messages) for the pool to handle.
In this case, since there are fewer tasks than worker processes, at most two of the workers will do any work while the rest stay idle. With more tasks than workers, the tasks would be distributed across the available processes.
Multiprocessing is particularly beneficial for CPU-bound tasks that require performance enhancements during heavy computations.
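As a sketch of a CPU-bound use case, the example below spreads a pure computation across worker processes and collects the results. The helper names `sum_of_squares` and `run_in_parallel` are hypothetical, and two workers is an arbitrary choice.

```python
from multiprocessing import Pool


def sum_of_squares(n: int) -> int:
    # CPU-bound work: pure computation, no waiting on I/O.
    return sum(i * i for i in range(n))


def run_in_parallel(sizes: list[int], workers: int = 2) -> list[int]:
    # Each input is handed to a worker process with its own interpreter and GIL.
    with Pool(workers) as pool:
        return pool.map(sum_of_squares, sizes)


if __name__ == "__main__":
    print(run_in_parallel([10, 100, 1_000]))
```

Unlike the message example, this one uses the return value of `pool.map`, which gives back the results in the same order as the inputs. For inputs this small, the cost of starting processes outweighs the gain; the benefit appears when each task involves heavy computation.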
— Multithreading
By default, a Python process starts with a single thread, the main thread. To run multiple threads within a single process, we can use the built-in threading module. Threads do not bypass the GIL: only one of them executes Python bytecode at a time. However, the GIL is released while a thread waits on blocking I/O, so other threads can make progress in the meantime.
import threading
Let's utilize the same example from multiprocessing to create multiple threads:
import threading
from dataclasses import dataclass


@dataclass
class MessageObject:
    recipient: str
    message: str


def send_message(msg_obj: MessageObject) -> None:
    print(f"sending message to - {msg_obj.recipient}.")
    # do some cool stuff


if __name__ == '__main__':
    # Create a thread for each message we want to send
    foo_thread = threading.Thread(target=send_message, args=(MessageObject(recipient="Foo", message="Bar"),))
    john_thread = threading.Thread(target=send_message, args=(MessageObject(recipient="John", message="Doe"),))
    # Start our threads
    foo_thread.start()
    john_thread.start()
    # Wait for threads to complete
    foo_thread.join()
    john_thread.join()
In this example, we create new threads using the Thread object and initiate them with the start method. The join method blocks the calling thread until the thread whose join method was called finishes.
Multithreading is particularly useful for I/O-bound tasks, such as making external API requests.
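To illustrate the I/O-bound case, the sketch below uses `ThreadPoolExecutor` from the standard library, with `time.sleep` standing in for a real network request. `fake_api_call` is a hypothetical placeholder, not a real API client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_api_call(name: str) -> str:
    # time.sleep stands in for a network request; the GIL is released while waiting.
    time.sleep(0.2)
    return f"response for {name}"


names = ["Foo", "John", "Jane"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as executor:
    # map returns results in the same order as the inputs.
    responses = list(executor.map(fake_api_call, names))
elapsed = time.perf_counter() - start

print(responses)
print(f"took {elapsed:.2f}s")
```

Because the three simulated requests wait at the same time, the total is typically close to the duration of one request rather than the sum of all three.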
3. Concurrency in Python
"Concurrency is dealing with a lot of things at once." — Rob Pike
In Python, we can execute concurrent tasks using either of the two methods discussed earlier — multiprocessing and multithreading — or through asynchronous programming.
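For the third option, asynchronous programming, here is a minimal sketch using the standard `asyncio` module. The coroutine `send_message` is a hypothetical stand-in that mirrors the earlier examples, and `asyncio.sleep` simulates waiting on I/O.

```python
import asyncio


async def send_message(recipient: str) -> str:
    # await hands control back to the event loop while this task waits.
    await asyncio.sleep(0.1)
    return f"sent to {recipient}"


async def main() -> list[str]:
    # Run both coroutines concurrently on a single thread.
    return await asyncio.gather(send_message("Foo"), send_message("John"))


results = asyncio.run(main())
print(results)
```

Here a single thread switches between tasks whenever one of them awaits, so no extra threads or processes are created.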
Final Thoughts
Interviews can be challenging, but they also present valuable learning opportunities. There were many questions I couldn't answer in the past, and some where my attempted answer only made things worse.
I still maintain numerous notes from prior interviews, highlighting areas for improvement. In my next article, I will gather and share three more questions with you.
About the author: Yanick is a Solutions Engineer working for a Spanish company and residing in Portugal. He has been coding in Python since 2018 and writing on Medium since 2020.
I aim to share intriguing insights about Python and programming that can assist you in your daily tasks. Join me on this journey!