Background

When I prepare classes, I spend a significant amount of time (maybe 20-30% of the whole process) finding or constructing **perfect** language examples or images that I can use for my Japanese language class. A classic example is images for the unit on counters (e.g., 一枚、二枚、一冊、二冊, etc.), where an instructor has to find or create images for different numbers of objects and different shapes. It's a lot harder than you think to find an image that fits the exact description (try finding an image of 8 pieces of paper, for example).

So I became very curious about what AI-based generation tools can offer, and I wanted to see how useful they can be in designing images, text, and audio for language instruction.

Starting with images, I have tried to create several different types of instructional materials with AI. Below are some example resources generated by AI and the prompts that I used to design these materials.

By 2023, several text-to-image models were available, including OpenAI's DALL-E, Stability AI's Stable Diffusion, and Midjourney. Each model had its own strengths (and weaknesses), but overall they successfully produced target objects (such as "books" and "apples") as photorealistic images. However, the generated images were not consistent, especially when I provided detailed prompts -- for example, the image was not accurate when I asked for "five apples and two bananas on the table."

Cost was another challenge. Generating images and videos with AI usually costs significantly more than generating text. For example, the latest models in 2025 (Imagen 4, etc.) cost about $0.20–$0.40 per image. If we need to generate dozens of images regularly, the cost becomes prohibitive for instructional use.

Much of the criticism of text-to-image technology has come from creative professionals such as artists, photographers, and videographers. While I personally have no objection to using AI-generated images, I agree with the need for ethical practices in using them (or AI in general). One suggested practice is to always disclose that an image is AI-generated. Unfortunately, there is no consensus on the ethical approach to AI-generated images. While some believe disclosure is good enough, others feel any use of AI-generated images is offensive. This is a major hurdle for anyone who intends to use AI for classroom instruction.

Below is the first set of images that I generated with Stability AI in 2023. I kept the prompts for these models simple because the images went wild when the prompt was too long. For an image of books, for example, the overall quality was better with simple commands such as "books" or "books on the desk" than with a long, detailed description. An example prompt is also presented.

Sample Command (StabilityAI)
A realistic photo of following object for instructional materials. Simplify the image so that it highlights the object: [A SINGLE WORD TARGET (such as "teacher" and "apple")]

[INFO] Generated image ....
🖌️ Images
[15 sample images]
See more examples in my Dropbox folder: https://t-nagano.link/FunWithAI-StabilityAI

Google's Imagen 3 was probably the most advanced AI image generation tool in 2024. Imagen 3 was particularly good at understanding instructions (prompts) written in natural language. It claims to be able to generate English text (such as a sign on a store), but the text it produced often contained errors, and it never produced correct Japanese text. The prompt must be in English -- if I use a Japanese word in the prompt, it often produces erratic images (you can see some of them in the sample images below).

With this model, the price for image generation became more accessible, in the range of $0.10-$0.20. In late 2024, I generated many images that can be used for our Japanese language courses. The first set of images is for the vocabulary items in the Genki textbooks. Below are the prompts and the images that it generated.

Sample Prompt (Imagen3)
Create a photorealistic image without any text that illustrates the following expression for learners of Japanese.: teacher or professor.

[INFO] Generated image ....
🖌️ Images (examples)
[12 sample images]
The original images are available in my Dropbox folder: https://t-nagano.link/FunWithAI-Imagen3-1

In early 2025, DeepSeek started to offer extremely affordable API access to its reasoning model (DeepSeek R1), which turned out to be extremely good at Japanese language resources (see below for more information about this). I tried using a combination of DeepSeek R1 and Imagen3 to make more complex images. Some examples follow.

  • For the unit of invitation 「ませんか」 in Chapter 3, I made images for the following expressions:
    • コーヒーを飲みませんか。
    • 映画を見ませんか。
    • セントラルパークで散歩しませんか
    • 図書館で勉強しませんか。
    • ニューヨークの美術館へ行きませんか。 ... etc.

The process involved two steps:

  • Creating a list of practice sentences for the target grammar (e.g., 「ませんか」) with DeepSeek (see below for more information about this)
  • Generating images for these example sentences with Imagen3

Since Imagen3 does not perform very well with Japanese prompts, I had to create English translations for the Japanese practice sentences. Imagen3 did not generate any meaningful Japanese text, so I added an instruction not to use any text in the image (but the model still generated text sometimes). People of Asian backgrounds were overrepresented in the images, so I also added an instruction to diversify people's backgrounds.
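The two steps above can be connected with a short script. Below is a minimal sketch in Python, assuming the DeepSeek output arrives as one "Japanese; English" pair per line (the format requested in the DeepSeek prompt in this article). The names `parse_examples` and `build_image_prompt` are illustrative, not part of any existing tool, and the instruction text mirrors the Imagen3 sample prompt shown in this article.

```python
from typing import NamedTuple


class Example(NamedTuple):
    japanese: str
    english: str


def parse_examples(raw: str) -> list[Example]:
    """Parse 'Japanese; English' lines, skipping blank or malformed lines."""
    examples = []
    for line in raw.splitlines():
        line = line.strip()
        if ";" not in line:
            continue  # blank line or stray commentary from the model
        japanese, english = (part.strip() for part in line.split(";", 1))
        if japanese and english:
            examples.append(Example(japanese, english))
    return examples


# Instruction text modeled on the Imagen3 sample prompt in this article.
IMAGE_INSTRUCTIONS = (
    "Make a realistic photo for the following sentence. "
    "Make sure to represent a diverse background. "
    "Do not include any text or written characters."
)


def build_image_prompt(example: Example) -> str:
    """Use only the English translation, since Japanese prompts performed poorly."""
    return f"{IMAGE_INSTRUCTIONS} {example.english}"


raw_output = """\
コーヒーを飲みませんか。; Would you like to drink coffee?
映画を見ませんか。; Would you like to watch a movie?
"""

prompts = [build_image_prompt(e) for e in parse_examples(raw_output)]
print(prompts[0])
```

Keeping the Japanese sentence next to its translation makes it easy to name each image file after the sentence it illustrates, although the naming scheme is left open here.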

Sample Prompt (DeepSeek R1)
ใ€Œ๏ฝžใพใ›ใ‚“ใ‹ (e.g., ใ‚ณใƒผใƒ’ใƒผใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚)ใ€ใ‚’ไฝฟใฃใŸไพ‹ๆ–‡ใ‚’ใ€ๆ—ฅๆœฌ่ชžใฎๆ•™็ง‘ๆ›ธใ€Œใ’ใ‚“ใใ€ใ‚’ไฝฟใฃใฆใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒฉใ‚ฌใƒผใƒ‡ใ‚ฃใ‚ขใ‚ณใƒŸใƒฅใƒ‹ใƒ†ใ‚ฃใƒผใ‚ซใƒฌใƒƒใ‚ธใงๆ—ฅๆœฌ่ชžใ‚’ๅญฆ็ฟ’ใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใฎใŸใ‚ใซ50ๅ€‹ไฝœใฃใฆใใ ใ•ใ„ใ€‚ใ€Œใ’ใ‚“ใใ€ใฎCh.1-Ch.03ใพใงใฎๅ˜่ชžใ‚’ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใ€ใใฎ็ฏ„ๅ›ฒๅ†…ใฎๅ˜่ชžใ ใ‘ใฎไพ‹ๆ–‡ใ‚’ไฝœใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ใƒˆใƒ”ใƒƒใ‚ฏใฏใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใซ้–ขใ™ใ‚‹ๅ ดๆ‰€ใ‚„ใ‚คใƒ™ใƒณใƒˆใ‚’ๆ™‚ๆŠ˜ใ€ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ๅ‡บๅŠ›ใฏใƒ‡ใƒผใ‚ฟใฎใฟใซใ—ใฆใ€ใ€Œๆ—ฅๆœฌ่ชžใฎไพ‹ๆ–‡; ่‹ฑ่ชžใฎ่จณใ€ๅฝขๅผใงๅ‡บใ—ใฆใ€่งฃ่ชฌใฏใคใ‘ใชใ„ใงใใ ใ•ใ„ใ€‚

ใ‚ณใƒผใƒ’ใƒผใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to drink coffee?
ๆ˜ ็”ปใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to watch a movie?
ใ‚ปใƒณใƒˆใƒฉใƒซใƒ‘ใƒผใ‚ฏใงๆ•ฃๆญฉใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to take a walk in Central Park?
ๅ›ณๆ›ธ้คจใงๅ‹‰ๅผทใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to study at the library?
ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎ็พŽ่ก“้คจใธ่กŒใใพใ›ใ‚“ใ‹ใ€‚; Would you like to go to a museum in New York?
ๆ˜ผใ”้ฃฏใ‚’้ฃŸในใพใ›ใ‚“ใ‹ใ€‚; Would you like to eat lunch?
็ด…่Œถใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to drink tea?
ๅ‹้”ใซไผšใ„ใพใ›ใ‚“ใ‹ใ€‚; Would you like to meet a friend?
ใ‚ฟใ‚คใƒ ใ‚บใ‚นใ‚ฏใ‚จใ‚ขใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to see Times Square?
้Ÿณๆฅฝใ‚’่žใใพใ›ใ‚“ใ‹ใ€‚; Would you like to listen to music?
ๆœฌใ‚’่ชญใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to read a book?
ๅ…ฌๅœ’ใงใƒ”ใ‚ฏใƒ‹ใƒƒใ‚ฏใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to have a picnic in the park?
ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒ‘ใƒฌใƒผใƒ‰ใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to watch a parade in New York?
... [snap]

A note about diverse backgrounds was added to the prompt because the skewed representation of people in Imagen3 became a major concern (i.e., minorities were not represented in images). When the target sentence included "Japanese," people from Asian backgrounds were overrepresented. For example, many Japanese-looking people appeared for the sentence "I go to the Japanese class," which does not make sense, since students in a Japanese class are not necessarily Japanese. In addition, AI-generated nudity became an issue, and Imagen3 later added a very strong filter, which blocked a certain set of words such as "student", "young", and "girls." The prompts, therefore, had to be modified -- for example, I had to use "a classroom packed with people" instead of "a classroom packed with students."
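One way to handle the blocked words is to rewrite prompts before sending them. The sketch below is only an illustration: the blocked words are the three mentioned above, and the replacement table is hypothetical except for "students" -> "people," the substitution described above. Words without a replacement are reported so that the prompt can be reworded by hand.

```python
import re

# Words reported above as blocked by the image filter.
BLOCKED_WORDS = {"student", "young", "girls"}

# Hypothetical replacement table; only "students" -> "people" comes from the text.
REPLACEMENTS = {"students": "people", "student": "person"}


def soften_prompt(prompt: str) -> tuple[str, list[str]]:
    """Return the rewritten prompt and any blocked words that still remain."""
    # Try longer keys first so "students" is not matched as "student" + "s".
    keys = sorted(REPLACEMENTS, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b"

    def swap(match: re.Match) -> str:
        return REPLACEMENTS[match.group(0).lower()]

    rewritten = re.sub(pattern, swap, prompt, flags=re.IGNORECASE)
    remaining = [
        word for word in sorted(BLOCKED_WORDS)
        if re.search(rf"\b{re.escape(word)}\b", rewritten, re.IGNORECASE)
    ]
    return rewritten, remaining


print(soften_prompt("a classroom packed with students"))
print(soften_prompt("two young girls reading"))
```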

A note about text/characters was also added since Imagen3 was never able to produce correct text in its images. About 50-70% of the English text was wrong, and almost none of the Japanese text was correct.

Sample Prompt (Imagen3)
Make a realistic photo for the following sentence. Make sure to represent a diverse background. Do not include any text or written characters. Would you like to drink coffee?

[INFO] Generated image ....
🖌️ Images
[12 sample images]
The original images are available in my Dropbox folder: https://t-nagano.link/FunWithAI-Imagen3-2

DeepSeek was introduced in the U.S. in 2025 and offered a highly capable reasoning model at a significantly reduced cost. In my observation, DeepSeek performs especially well on topics of Japanese language learning, probably because the model has been trained on a large number of Japanese language learning materials (DeepSeek was developed in China). For example, DeepSeek understands what kind of words and grammatical structures it should use from an instruction as simple as "Use Japanese appropriate for students who are studying chapter 3 of the Genki textbook."

Its reasoning model (R1) was extremely affordable (less than $0.01 for each query), and it was possible to produce a large number of text-based materials with it. As an initial project, I generated a large number of example sentences that I can use in my class. The generated output was appropriate for the students' level, and its topics were diverse. To be honest, it generated much better example sentences than I do, since I tend to use the same topics and sentence patterns in my examples. Errors did happen, but they were rare (about 1 in 100-200 sentences).

Sample Prompt (DeepSeek R1)
ใ€Œ๏ฝžใพใ›ใ‚“ใ‹ (e.g., ใ‚ณใƒผใƒ’ใƒผใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚)ใ€ใ‚’ไฝฟใฃใŸไพ‹ๆ–‡ใ‚’ใ€ๆ—ฅๆœฌ่ชžใฎๆ•™็ง‘ๆ›ธใ€Œใ’ใ‚“ใใ€ใ‚’ไฝฟใฃใฆใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒฉใ‚ฌใƒผใƒ‡ใ‚ฃใ‚ขใ‚ณใƒŸใƒฅใƒ‹ใƒ†ใ‚ฃใƒผใ‚ซใƒฌใƒƒใ‚ธใงๆ—ฅๆœฌ่ชžใ‚’ๅญฆ็ฟ’ใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใฎใŸใ‚ใซ50ๅ€‹ไฝœใฃใฆใใ ใ•ใ„ใ€‚ใ€Œใ’ใ‚“ใใ€ใฎCh.1-Ch.03ใพใงใฎๅ˜่ชžใ‚’ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใ€ใใฎ็ฏ„ๅ›ฒๅ†…ใฎๅ˜่ชžใ ใ‘ใฎไพ‹ๆ–‡ใ‚’ไฝœใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ใƒˆใƒ”ใƒƒใ‚ฏใฏใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใซ้–ขใ™ใ‚‹ๅ ดๆ‰€ใ‚„ใ‚คใƒ™ใƒณใƒˆใ‚’ๆ™‚ๆŠ˜ใ€ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ๅ‡บๅŠ›ใฏใƒ‡ใƒผใ‚ฟใฎใฟใซใ—ใฆใ€ใ€Œๆ—ฅๆœฌ่ชžใฎไพ‹ๆ–‡; ่‹ฑ่ชžใฎ่จณใ€ๅฝขๅผใงๅ‡บใ—ใฆใ€่งฃ่ชฌใฏใคใ‘ใชใ„ใงใใ ใ•ใ„ใ€‚

ใ‚ณใƒผใƒ’ใƒผใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to drink coffee?
ๆ˜ ็”ปใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to watch a movie?
ใ‚ปใƒณใƒˆใƒฉใƒซใƒ‘ใƒผใ‚ฏใงๆ•ฃๆญฉใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to take a walk in Central Park?
ๅ›ณๆ›ธ้คจใงๅ‹‰ๅผทใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to study at the library?
ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎ็พŽ่ก“้คจใธ่กŒใใพใ›ใ‚“ใ‹ใ€‚; Would you like to go to a museum in New York?
ๆ˜ผใ”้ฃฏใ‚’้ฃŸในใพใ›ใ‚“ใ‹ใ€‚; Would you like to eat lunch?
็ด…่Œถใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to drink tea?
ๅ‹้”ใซไผšใ„ใพใ›ใ‚“ใ‹ใ€‚; Would you like to meet a friend?
ใ‚ฟใ‚คใƒ ใ‚บใ‚นใ‚ฏใ‚จใ‚ขใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to see Times Square?
้Ÿณๆฅฝใ‚’่žใใพใ›ใ‚“ใ‹ใ€‚; Would you like to listen to music?
ๆœฌใ‚’่ชญใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to read a book?
ๅ…ฌๅœ’ใงใƒ”ใ‚ฏใƒ‹ใƒƒใ‚ฏใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to have a picnic in the park?
ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒ‘ใƒฌใƒผใƒ‰ใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to watch a parade in New York?
ใ‚ฑใƒผใ‚ญใ‚’้ฃŸในใพใ›ใ‚“ใ‹ใ€‚; Would you like to eat cake?
ๆฐดใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to drink water?
ๆ˜ ็”ป้คจใธ่กŒใใพใ›ใ‚“ใ‹ใ€‚; Would you like to go to the movie theater?
ๅญฆๆ กใงๆ—ฅๆœฌ่ชžใ‚’ๅ‹‰ๅผทใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to study Japanese at school?
่‡ช็”ฑใฎๅฅณ็ฅžใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to see the Statue of Liberty?
ใ‚ณใƒผใƒ’ใƒผใ‚ทใƒงใƒƒใƒ—ใงไผ‘ใฟใพใ›ใ‚“ใ‹ใ€‚; Would you like to rest at a coffee shop?
ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใ‚ณใƒณใ‚ตใƒผใƒˆใธ่กŒใใพใ›ใ‚“ใ‹ใ€‚; Would you like to go to a concert in New York?
ใƒ†ใƒฌใƒ“ใ‚’่ฆ‹ใพใ›ใ‚“ใ‹ใ€‚; Would you like to watch TV?
้ฃŸไบ‹ใ‚’ใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to have a meal?
ๆ•ฃๆญฉใ—ใพใ›ใ‚“ใ‹ใ€‚; Would you like to take a walk?
... [snap]
📄 Text/PDF

DeepSeek can generate an extremely large number of text-based outputs. There are over 1,000 sentences for each target structure. See all the files (in text format) in my Dropbox folder: https://t-nagano.link/FunWithAI-DeepSeekSent
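Collecting over 1,000 sentences for a structure means combining the output of many runs, and separate runs may well return some of the same sentences. The sketch below shows one possible way to merge runs while keeping only the first copy of each Japanese sentence. It assumes the "Japanese; English" line format shown above; `merge_batches` is a hypothetical helper name.

```python
def merge_batches(batches: list[str]) -> list[str]:
    """Merge raw outputs from several runs, dropping repeated Japanese sentences."""
    seen = set()
    merged = []
    for raw in batches:
        for line in raw.splitlines():
            line = line.strip()
            if ";" not in line:
                continue  # skip blank or malformed lines
            japanese = line.split(";", 1)[0].strip()
            if japanese and japanese not in seen:
                seen.add(japanese)
                merged.append(line)
    return merged


batch_1 = (
    "コーヒーを飲みませんか。; Would you like to drink coffee?\n"
    "映画を見ませんか。; Would you like to watch a movie?"
)
batch_2 = (
    "映画を見ませんか。; Would you like to see a movie?\n"
    "本を読みませんか。; Would you like to read a book?"
)

sentences = merge_batches([batch_1, batch_2])
print(len(sentences))
```

Duplicates are detected on the Japanese side only, so two runs that translate the same sentence slightly differently still count as one entry.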

DeepSeek is particularly good at generating text for Japanese language learners (probably because DeepSeek, a Chinese LLM initiative, has access to a large amount of material for Japanese language learners for training).

Here I tried to generate reading passages for each target grammar as well as for each Genki textbook chapter.

Sample Prompt (DeepSeek R1)
ใ€Œ๏ฝžใพใ›ใ‚“ใ‹ (e.g., ใ‚ณใƒผใƒ’ใƒผใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚)ใ€ใ‚’ไฝฟใฃใŸ50่ชž็จ‹ๅบฆใฎ่ชญใฟ็‰ฉใ‚’ใ€ๆ—ฅๆœฌ่ชžใฎๆ•™็ง‘ๆ›ธใ€Œใ’ใ‚“ใใ€ใ‚’ไฝฟใฃใฆๅ‹‰ๅผทใ—ใฆใ„ใ‚‹ๆ—ฅๆœฌ่ชžใ‚’ๅญฆ็ฟ’ใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใฎใŸใ‚ใซไฝœใฃใฆใใ ใ•ใ„ใ€‚ใ€Œใ’ใ‚“ใใ€ใฎCh.1-Ch.03ใฎๅ˜่ชžใ‚’ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใ€ใใฎ็ฏ„ๅ›ฒๅ†…ใฎๅ˜่ชžใ ใ‘ใ‚’ๅˆฉ็”จใ—ใฆ่ชญใฟ็‰ฉใ‚’ไฝœใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚่ชญใฟ็‰ฉใฏใ€ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒˆใƒ”ใƒƒใ‚ฏใ€็‰นใซใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒฉใ‚ฌใƒผใƒ‡ใ‚ฃใ‚ขใ‚ณใƒŸใƒฅใƒ‹ใƒ†ใ‚ฃใƒผใ‚ซใƒฌใƒƒใ‚ธใงๆ—ฅๆœฌ่ชžใ‚’ๅ‹‰ๅผทใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใŒ่ˆˆๅ‘ณใŒใ‚ใ‚‹ใ‚ˆใ†ใชใƒˆใƒ”ใƒƒใ‚ฏ(ใ‚ขใƒ‹ใƒกใ‚„ใ‚ฒใƒผใƒ ใ€ๆผซ็”ปใชใฉ)ใ‚’ๅ–ใ‚Šๆ‰ฑใ†ใ‚ˆใ†ใซใ—ใฆใ€5-6่ชž็จ‹ๅบฆใฎใ‚ฟใ‚คใƒˆใƒซใงๅง‹ใพใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ๅ‡บๅŠ›ใฏใƒ‡ใƒผใ‚ฟใฎใฟใซใ—ใฆใ€่งฃ่ชฌใฏใคใ‘ใชใ„ใงใใ ใ•ใ„ใ€‚

ๅ…ฌๅœ’ใงๆœฌใ‚’่ชญใฟใพใ›ใ‚“ใ‹

ใ“ใ‚“ใซใกใฏใ€‚็งใฏ้ˆดๆœจใงใ™ใ€‚ๅคงๅญฆใฎๅญฆ็”Ÿใงใ™ใ€‚ๅ‹้”ใฎ้ซ˜ๆฉ‹ใ•ใ‚“ใŒใ„ใพใ™ใ€‚ไปŠๆ—ฅใ€ๅคฉๆฐ—ใŒใ„ใ„ใงใ™ใ€‚้ซ˜ๆฉ‹ใ•ใ‚“ใŒใ€Œๅ…ฌๅœ’ใงๆœฌใ‚’่ชญใฟใพใ›ใ‚“ใ‹ใ€‚ใ€ใจ่จ€ใ„ใพใ™ใ€‚็งใฏใ€Œใฏใ„ใ€่ชญใฟใพใ—ใ‚‡ใ†ใ€‚ใ€ใจ่จ€ใ„ใพใ™ใ€‚ๅ…ฌๅœ’ใธ่กŒใใพใ™ใ€‚ๅ…ฌๅœ’ใฏๅคงใใ„ใงใ™ใ€‚ๆœจใฎไธ‹ใงๆœฌใ‚’่ชญใฟใพใ™ใ€‚ๆœฌใฏ้ข็™ฝใ„ใงใ™ใ€‚ใใ‚Œใ‹ใ‚‰ใ€ใŠ่Œถใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚้ซ˜ๆฉ‹ใ•ใ‚“ใฏใ€Œใ„ใ„ใงใ™ใญใ€‚ใ€ใจ่จ€ใ„ใพใ™ใ€‚ๆฅฝใ—ใ„ๅˆๅพŒใงใ™ใ€‚

📄 Text/PDF

DeepSeek can generate an extremely large number of text-based outputs. See all the files (in text format) in my Dropbox folder: https://t-nagano.link/FunWithAI-DeepSeekPass

These are reading passages for each chapter of the Genki textbook. Beginning-level passages (Ch01-Ch05) tend to use many vocabulary items and structures that the chapters have not covered yet, but they can still be useful with some modifications/corrections. Those after Chapter 6 have a good balance of the target vocabulary items and structures and new items, which are often presented with sufficient context, and proficient readers will be able to make reasonable guesses.

Sample Prompt (DeepSeek R1)
ๆ—ฅๆœฌ่ชžใฎๆ•™็ง‘ๆ›ธใ€Œใ’ใ‚“ใใ€ใฎCh.3ใฎใƒˆใƒ”ใƒƒใ‚ฏใ€ๆ–‡ๆณ•้ …็›ฎใ€ใƒœใ‚ญใƒฃใƒ–ใƒฉใƒชใƒผใซๆฒฟใฃใŸ50่ชž็จ‹ๅบฆใฎ่ชญใฟ็‰ฉใ‚’ใ€ๆ—ฅๆœฌ่ชžใฎๆ•™็ง‘ๆ›ธใ€Œใ’ใ‚“ใใ€ใ‚’ไฝฟใฃใฆๅ‹‰ๅผทใ—ใฆใ„ใ‚‹ๆ—ฅๆœฌ่ชžใ‚’ๅญฆ็ฟ’ใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใฎใŸใ‚ใซไฝœใฃใฆใใ ใ•ใ„ใ€‚ใ€Œใ’ใ‚“ใใ€ใฎCh.1-Ch.03ใฎๅ˜่ชžใ‚’ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใ€ใใฎ็ฏ„ๅ›ฒๅ†…ใฎๅ˜่ชžใ ใ‘ใ‚’ๅˆฉ็”จใ—ใฆ่ชญใฟ็‰ฉใ‚’ไฝœใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚่ชญใฟ็‰ฉใฏใ€ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒˆใƒ”ใƒƒใ‚ฏใ€็‰นใซใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎใƒฉใ‚ฌใƒผใƒ‡ใ‚ฃใ‚ขใ‚ณใƒŸใƒฅใƒ‹ใƒ†ใ‚ฃใƒผใ‚ซใƒฌใƒƒใ‚ธใงๆ—ฅๆœฌ่ชžใ‚’ๅ‹‰ๅผทใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใŒ่ˆˆๅ‘ณใŒใ‚ใ‚‹ใ‚ˆใ†ใชใƒˆใƒ”ใƒƒใ‚ฏ(ใ‚ขใƒ‹ใƒกใ‚„ใ‚ฒใƒผใƒ ใ€ๆผซ็”ปใชใฉ)ใ‚’ๅ–ใ‚Šๆ‰ฑใ†ใ‚ˆใ†ใซใ—ใฆใ€5-6่ชž็จ‹ๅบฆใฎใ‚ฟใ‚คใƒˆใƒซใงๅง‹ใพใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ๅ‡บๅŠ›ใฏใƒ‡ใƒผใ‚ฟใฎใฟใซใ—ใฆใ€่งฃ่ชฌใฏใคใ‘ใชใ„ใงใใ ใ•ใ„ใ€‚

ๅญฆๆ กใงใฎไธ€ๆ—ฅใฎๅ‹‰ๅผท
ไปŠๆ—ฅใ€็งใฏๆœใ‹ใ‚‰ๅญฆๆ กใธ่กŒใใพใ—ใŸใ€‚ๆ•™ๅฎคใงๅ‹้”ใจไผšใ„ใพใ—ใŸใ€‚็งใŸใกใฏๆ—ฅๆœฌ่ชžใฎๅ‹‰ๅผทใ‚’ใ—ใพใ—ใŸใ€‚ๅ…ˆ็”ŸใŒใ€Œๆ•™็ง‘ๆ›ธใ‚’่ชญใ‚“ใงใใ ใ•ใ„ใ€ใจ่จ€ใ„ใพใ—ใŸใ€‚็งใฏๅฃฐใ‚’ๅ‡บใ—ใฆ่ชญใฟใพใ—ใŸใ€‚ๅ‹้”ใฏใ€Œ่ณชๅ•ใ—ใฆใ‚‚ใ„ใ„ใงใ™ใ‹ใ€ใจ่žใใพใ—ใŸใ€‚ๅ…ˆ็”Ÿใฏใ€Œใฏใ„ใ€ใฉใ†ใžใ€ใจ็ญ”ใˆใพใ—ใŸใ€‚ๆ˜ผไผ‘ใฟใซใ€้ฃŸๅ ‚ใงใ”้ฃฏใ‚’้ฃŸในใพใ—ใŸใ€‚ๅˆๅพŒใ€ๅ›ณๆ›ธ้คจใธ่กŒใฃใฆใ€ๆœฌใ‚’ๅ€Ÿใ‚Šใพใ—ใŸใ€‚ใจใฆใ‚‚ๆฅฝใ—ใ„ไธ€ๆ—ฅใงใ—ใŸใ€‚

📄 Text/PDF

DeepSeek can generate an extremely large number of text-based outputs. See all the files (in text format) in my Dropbox folder: https://t-nagano.link/FunWithAI-DeepSeekPass

Task-based Language Learning (TBLL) is an extremely engaging and effective approach to language teaching. However, the major challenge in implementing TBLL is the amount of time and energy required to build meaningful tasks that are relevant to the learning objective (e.g., target grammar or vocabulary) and to students' lives.

It turned out that DeepSeek can generate reasonably good tasks. We can also make the tasks specific to students' environments by adjusting prompts.

In my experience, it works best to generate a few (probably 4-5) different tasks by adjusting prompts -- at least one of them is good enough for actual classroom instruction.

Sample Prompt (DeepSeek R1)
ใ€Œ๏ฝžใพใ›ใ‚“ใ‹ (e.g., ใ‚ณใƒผใƒ’ใƒผใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚)ใ€ใ‚’็ทด็ฟ’ใ™ใ‚‹task-basedใฎ่ชฒ้กŒใ‚’ใ€ๆ—ฅๆœฌ่ชžใฎๆ•™็ง‘ๆ›ธใ€Œใ’ใ‚“ใใ€ใ‚’ไฝฟใฃใฆๅ‹‰ๅผทใ—ใฆใ„ใ‚‹ๆ—ฅๆœฌ่ชžใ‚’ๅญฆ็ฟ’ใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใฎใŸใ‚ใซไฝœใฃใฆใใ ใ•ใ„ใ€‚ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใฎๅคงๅญฆใ‚„็Ÿญๅคงใงๅ‹‰ๅผทใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใฎใŸใ‚ใซใ€ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใซ้–ขไฟ‚ใ™ใ‚‹ใƒˆใƒ”ใƒƒใ‚ฏใ‚’็ฉๆฅต็š„ใซๅˆฉ็”จใ—ใฆใใ ใ•ใ„ใ€‚ใ€Œใ’ใ‚“ใใ€ใฎCh.1-Ch.03ใพใงใฎๅ˜่ชžใ‚’ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใ€ใใฎ็ฏ„ๅ›ฒๅ†…ใฎๅ˜่ชžใ ใ‘ใ‚’ๅˆฉ็”จใ™ใ‚‹ใ‚ˆใ†ใซๅฟƒใŒใ‘ใฆไธ‹ใ•ใ„ใ€‚ใ‚ฟใ‚นใ‚ฏใฏใ€ๅญฆ็”ŸใŒๅฎŸ้š›ใซ็›ด้ขใ™ใ‚‹ใ‚ˆใ†ใช็Šถๆณใ‚’่จญๅฎšใ™ใ‚‹ใŸใ‚ใ€่ฉณใ—ใ„ใƒใƒƒใ‚ฏใ‚ฐใƒฉใ‚ฆใƒณใƒ‰ใ‚„ใ‚ฟใ‚นใ‚ฏใง่งฃๆฑบใ™ในใๅ•้กŒใซใคใ„ใฆ่‹ฑ่ชžใง่ฉณใ—ใ่ฟฐในใฆใ€ใใฎๅพŒใซๅฟ…่ฆใชใƒœใ‚ญใƒฃใƒ–ใƒฉใƒชใƒผใ‚’ๆ็คบใ—ใ€ๆŒ‡ๅฐŽใฎใ‚นใƒ†ใƒƒใƒ—ใ‚’3-5ใ‚นใƒ†ใƒƒใƒ—ใง่กจ็คบใ—ใฆไธ‹ใ•ใ„ใ€‚ๅ‡บๅŠ›ใฏใƒ‡ใƒผใ‚ฟใฎใฟใซใ—ใฆใ€ๆ—ฅๆœฌ่ชžใฎๅ˜่ชžใ‚„ไพ‹ๆ–‡ไปฅๅค–ใฏ่‹ฑ่ชžใงๆ›ธใ„ใฆใ€ใใ—ใฆใ‚ฟใ‚นใ‚ฏใฎๅๅ‰ใ€ๅฟ…่ฆใชๆ•™ๆใ€็ทด็ฟ’ใ™ใ‚‹่จ€่ชžใ‚นใ‚ญใƒซใ€ๅฟ…่ฆใชๆ™‚้–“ใชใฉใ‚’ๆœ€ๅˆใซๆ›ธใ„ใฆใ€ใ‚ฟใ‚นใ‚ฏใฎ่ฉณ็ดฐใชใ‚นใƒ†ใƒƒใƒ—ใ‚’ๆ›ธใ„ใฆใใ ใ•ใ„ใ€‚่งฃ่ชฌใฏใคใ‘ใชใ„ใงใใ ใ•ใ„ใ€‚

### Task-Based Activity: "Exploring NYC Together - Museum Invitation Challenge"  
**Task Name:** Weekend Museum Invitation Plan  
**Target Language Pattern:** ～ませんか (e.g., コーヒーを飲みませんか。)  
**Textbook Source:** Genki I, Chapters 1–3  
**Target Learners:** Beginner Japanese students in New York colleges/community colleges  
**Materials Needed:** Whiteboard, Genki vocabulary lists (Ch.1–3), scenario cards, timer  
**Language Skills Practiced:** Speaking (invitations), listening (responses), vocabulary recall  
**Time Required:** 30–40 minutes  

### Background Scenario (in English)  
You are a new student at a New York college and want to make friends while exploring the city. This weekend, you plan to visit the **American Museum of Natural History** (or a similar NYC landmark). Your task is to invite classmates to join you using Japanese. You must negotiate a day/time, suggest meeting points, and politely handle acceptances or refusals. The challenge: use only vocabulary and grammar from Genki Chapters 1–3.  

### Key Vocabulary (from Genki Ch.1–3)  
**Nouns:**  
- コーヒー (coffee), おちゃ (tea), ばんごはん (dinner)  
- にちようび (Sunday), どようび (Saturday), あさ (morning), ごご (afternoon)  
- いま (now), あした (tomorrow), きょう (today)  
- せんせい (teacher), ともだち (friend), がくせい (student)  
- えいが (movie), ミュージアム (museum – loanword allowed in Ch.3)  
**Verbs:**  
- いきます (go), みます (see/watch), たべます (eat), のみます (drink), します (do)  
- ねます (sleep), おきます (wake up), べんきょうします (study)  
**Expressions:**  
- いいですね (That's good!), ちょっと… (a little… [for hesitance])  
- なんようび？ (What day?), なんじ？ (What time?)  

### Task Steps  

1. **Preparation (5 minutes)**  
   - Review the vocabulary list above. Write down 2–3 invitation sentences using "～ませんか" (e.g., どようびにミュージアムへいきませんか。).  
   - Brainstorm how to suggest meeting at a specific time/place (e.g., えきであいませんか。).  

2. **Role-Play Practice (15 minutes)**  
   - Pair up. One student is the "inviter," the other is the "classmate."  
   - Inviter: Propose a museum visit using "～ませんか." Include day, time, and activity (e.g., にちようびのごごにミュージアムをみませんか。).  
   - Classmate: Respond using Ch.1–3 vocabulary: either accept (いいですね！) or politely decline (ちょっと…べんきょうします).  
   - Swap roles and repeat with a different day/activity.  

3. **Expansion and Negotiation (10 minutes)**  
   - In pairs, plan details:  
     - Decide where to meet (e.g., がっこうで、パークで).  
     - Suggest an after-museum activity (e.g., コーヒーをのみませんか。).  
     - Use "なんじ？" or "なんようび？" to ask questions.  

4. **Group Sharing (5 minutes)**  
   - Each pair shares their final plan with the class in simple Japanese (e.g., どようびのごごにミュージアムへいきます。それから、コーヒーをのみます。).  

5. **Optional Challenge**  
   - If time allows, try inviting a "busy friend" (played by the teacher) who declines twice before accepting, to practice persistent but polite invitations.

📄 Text/PDF

DeepSeek can generate an extremely large number of outputs almost for free. See all the files (in text format) in my Dropbox folder: https://t-nagano.link/FunWithAI-DeepSeekTasks

Projects are extremely useful as student-centered homework and authentic assessment. Like tasks, however, it takes a considerable amount of time and energy to design effective projects.

It turned out that DeepSeek can generate reasonably good projects. Like tasks, it works best to generate a few (probably 4-5) different projects by adjusting prompts and at least one of them is good enough for actual classroom instruction.

Sample Prompt (DeepSeek R1)
ใ€Œ๏ฝžใพใ›ใ‚“ใ‹ (e.g., ใ‚ณใƒผใƒ’ใƒผใ‚’้ฃฒใฟใพใ›ใ‚“ใ‹ใ€‚)ใ€ใ‚’็ทด็ฟ’ใ™ใ‚‹ใƒ—ใƒญใ‚ธใ‚งใ‚ฏใƒˆใ‚’ใ€ๆ—ฅๆœฌ่ชžใฎๆ•™็ง‘ๆ›ธใ€Œใ’ใ‚“ใใ€ใ‚’ไฝฟใฃใฆๅ‹‰ๅผทใ—ใฆใ„ใ‚‹ๆ—ฅๆœฌ่ชžใ‚’ๅญฆ็ฟ’ใ—ใฆใ„ใ‚‹ๅญฆ็”ŸใฎใŸใ‚ใซไฝœใฃใฆใใ ใ•ใ„ใ€‚ใ€Œใ’ใ‚“ใใ€ใฎCh.1-Ch.03ใพใงใฎๅ˜่ชžใ‚’ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใ€ใใฎ็ฏ„ๅ›ฒๅ†…ใฎๅ˜่ชžใ ใ‘ใ‚’ไฝฟใ†ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ใƒ—ใƒญใ‚ธใ‚งใ‚ฏใƒˆใฏใ€ใ ใ„ใŸใ„1ๆ™‚้–“็จ‹ๅบฆใง็ต‚ใ‚ใ‚‹็ฏ„ๅ›ฒใงใ€ใƒ‹ใƒฅใƒผใƒจใƒผใ‚ฏใซใ‚ใ‚‹ๆ—ฅๆœฌใซ้–ขใ™ใ‚‹ใŠๅบ—ใ‚„ๅ›ฃไฝ“ใ€ๆ—ฅๆœฌ่ชžใŒไฝฟใ‚ใ‚Œใฆใ„ใ‚‹ๅ ดๆ‰€ใชใฉใ‚’ๅˆฉ็”จใ™ใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ๅ ดๆ‰€ใฏๆ—ฅๆœฌใฎใ‚ฆใ‚งใƒ–ใ‚ตใ‚คใƒˆใชใฉใฎใ‚ชใƒณใƒฉใ‚คใƒณใงใ‚‚ใ„ใ„ใงใ™ใ€‚ใƒ—ใƒญใ‚ธใ‚งใ‚ฏใƒˆใฏใ€ๆๅ‡บใงใ ใ‚‹deliverableใŒๆ˜Ž็ขบใซใ‚ใ‚‹ใ‚ˆใ†ใซใ—ใฆใใ ใ•ใ„ใ€‚ๅ‡บๅŠ›ใฏใƒ‡ใƒผใ‚ฟใฎใฟใซใ—ใฆใ€ๆ—ฅๆœฌ่ชžใฎๅ˜่ชžใ‚„ไพ‹ๆ–‡ไปฅๅค–ใฏ่‹ฑ่ชžใงๆ›ธใ„ใฆใ€ใใ—ใฆใƒ—ใƒญใ‚ธใ‚งใ‚ฏใƒˆใฎๅๅ‰ใ€ๅฟ…่ฆใชๆ•™ๆใ€็ทด็ฟ’ใ™ใ‚‹่จ€่ชžใ‚นใ‚ญใƒซใ€ๅฟ…่ฆใชๆ™‚้–“ใชใฉใ‚’ๆœ€ๅˆใซๆ›ธใ„ใฆใ€ใƒ—ใƒญใ‚ธใ‚งใ‚ฏใƒˆใฎ่ฉณ็ดฐใชใ‚นใƒ†ใƒƒใƒ—ใ‚’ๆ›ธใ„ใฆใใ ใ•ใ„ใ€‚่งฃ่ชฌใฏใคใ‘ใชใ„ใงใใ ใ•ใ„ใ€‚

**Project Name:**  
"NYCใงใ•ใŒใใ†๏ผ๏ฝžใพใ›ใ‚“ใ‹ Invitation Project"

**Materials Needed:**  
- "Genki I" Textbook (Ch.1-3 vocabulary reference)  
- Internet access (for website research)  
- Worksheet (printed or digital)  
- Pen/Notebook or digital document  

**Language Skills Practiced:**  
- Grammar: ～ませんか (informal invitations)  
- Vocabulary: Genki Ch.1-3 only  
- Reading: Simple website navigation  
- Writing: Forming invitation sentences  

**Time Required:**  
Approximately 60 minutes  

---

**Detailed Project Steps:**

1. **Preparation (5 minutes)**  
   - Review the "～ませんか" grammar point from Genki.  
   - Review vocabulary lists from Genki Chapters 1, 2, and 3.  
   - Open a web browser.

2. **Research Phase (25 minutes)**  
   - Visit the following real websites based in or related to New York:  
     a. **Mitsuwa Marketplace (Edgewater, NJ - near NYC)** website: Look for sections like レストラン (restaurant), カフェ (cafe), 本 (books), 食べ物 (food).  
     b. **Japan Society (NYC)** website: Look for イベント (events), ビデオ (videos), クラス (classes).  
     c. **Kinokuniya Bookstore (NYC)** website: Look for 本, 雑誌 (magazines), 文房具 (stationery).  
   - Using **only vocabulary from Genki Ch.1-3**, identify 5 items/activities you find. Write them down in Japanese.  
     *Example Finds: コーヒー、おちゃ、えいが、ほん、ノート、てがみ、サンドイッチ、ミーティング*

3. **Sentence Creation Phase (20 minutes)**  
   - Using your 5 found items/activities, create 5 invitation sentences using the "～ませんか" pattern.  
   - Each sentence must use **only vocabulary and numbers from Genki Ch.1-3**.  
     *Example Sentences:*  
     1. きょう、コーヒーを　飲みませんか。  
     2. あした、ほんを　読みませんか。  
     3. 木よう日に、えいがを　見ませんか。  
     4. ここで、サンドイッチを　食べませんか。  
     5. にちよう日に、パーティーへ　行きませんか。

4. **Deliverable Creation & Submission (10 minutes)**  
   - Fill out the final worksheet with:  
     A. The 5 items/activities you found (in Japanese).  
     B. Your 5 original invitation sentences.  
     C. One chosen invitation (from your 5 sentences) written neatly or typed for submission.  
   - Submit your final chosen invitation sentence and the name of the website you used most to your instructor via email or the provided Google Form.  

**Deliverable Format for Submission:**  
- Student Name:  
- Website Used Most:  
- My Invitation: [One complete sentence using ～ませんか]

📄 Text/PDF

DeepSeek can generate an extremely large number of outputs almost for free. See all the files (in text format) in my Dropbox folder: https://t-nagano.link/FunWithAI-DeepSeekProjects

Although often overlooked, OpenAI's audio-generation, or text-to-speech (tts), technology is a major asset. Before OpenAI's tts models, numerous commercial and academic projects produced synthetic speech systems, but they all had noticeable accents that could alienate audiences. OpenAI applied large language model technology to generate synthetic speech, which outperformed previous models that attempted to emulate human phonetic and phonological patterns.

OpenAI first announced the tts-1 model in 2023 and later released the gpt-4o-mini-tts model in 2025. As of this writing, the latest model remains gpt-4o-mini-tts (gpt-4o-mini-tts-2025-12-15).

OpenAI's gpt-4o-mini-tts offers 13 different speakers (voices) and supports audio instructions, with which you can make minor adjustments to the speech patterns. Below is an example command that generates a speech sample for each voice.

Sample Prompt (ChatGPT tts)
for speaker in "alloy" "ash" "ballad" "cedar" "coral" "echo" "fable" "marin" "nova" "onyx" "sage" "shimmer" "verse"; do generate_speech.py --audioModel gpt-4o-mini-tts-2025-12-15 --audioSpeaker $speaker --audioInstructions "Accent: warm, refined, and gently instructive, reminiscent of a friendly instructor. Tone: Calm, encouraging, and articulate. Pacing: Deliberate, pausing often to allow the listener to follow instructions comfortably. Emotion: Cheerful, supportive, and pleasantly enthusiastic" --text "The quick brown fox jumps over the lazy dog. one, two, three, four, five, six, seven, eight, nine, ten. Sally sells seashells by the seashore. Six sleek swans swam swiftly south."; done

[INFO] Audio has been generated... ....
🎵 Audio (speech samples)
🎧 OpenAI tts speaker: alloy
🎧 OpenAI tts speaker: ash
🎧 OpenAI tts speaker: ballad
🎧 OpenAI tts speaker: cedar
🎧 OpenAI tts speaker: coral
🎧 OpenAI tts speaker: echo
🎧 OpenAI tts speaker: fable
🎧 OpenAI tts speaker: marin
🎧 OpenAI tts speaker: nova
🎧 OpenAI tts speaker: onyx
🎧 OpenAI tts speaker: sage
🎧 OpenAI tts speaker: shimmer
🎧 OpenAI tts speaker: verse
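
The `generate_speech.py` script itself is not shown here. As a rough idea of what such a script might do, here is a minimal sketch that assumes the official `openai` Python package and an `OPENAI_API_KEY` environment variable. The helper names (`build_request`, `synthesize`) and the output file names are hypothetical, and the default model name is the unversioned `gpt-4o-mini-tts` rather than the dated snapshot used in the command above.

```python
from pathlib import Path

# The 13 voices listed in the sample command above.
VOICES = ["alloy", "ash", "ballad", "cedar", "coral", "echo", "fable",
          "marin", "nova", "onyx", "sage", "shimmer", "verse"]


def build_request(text: str, voice: str, instructions: str,
                  model: str = "gpt-4o-mini-tts") -> dict:
    """Collect the keyword arguments for one text-to-speech request."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    return {"model": model, "voice": voice, "input": text,
            "instructions": instructions}


def synthesize(request: dict, out_path: Path) -> None:
    """Send one request and save the audio (needs network access and an API key)."""
    from openai import OpenAI  # imported lazily so build_request works offline

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)


jobs = [
    build_request("The quick brown fox jumps over the lazy dog.", voice,
                  "Tone: Calm, encouraging, and articulate.")
    for voice in VOICES
]

# Uncomment to generate one file per voice (this incurs API charges):
# for job in jobs:
#     synthesize(job, Path(f"sample_{job['voice']}.mp3"))
```

Separating the request builder from the network call makes it possible to review every request before any audio is generated.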

Although these models are not specifically trained for Japanese, they tend to do very well with Japanese speech. Below is a sample with Japanese text.

Sample Prompt (ChatGPT tts)
generate_speech.py "次の日本語の文章を音声に変更してください" --audioInstructions "Speak in Japanese. Accent: warm, refined, and gently instructive, reminiscent of a friendly instructor. Tone: Calm, encouraging, and articulate. Pacing: Natural speed. Make sure to pause for one or two seconds at the end of each sentence. Emotion: Cheerful, supportive, and pleasantly enthusiastic" --audioModel gpt-4o-mini-tts-2025-12-15 --audioSpeaker echo "音声チェックです。ただいま、マイクのテストを行っています。いち、に、さん、よん、ご、ろく、なな、はち、きゅう、じゅう。生麦、生米、生卵。隣の客は、よく柿食う客だ。ただいま、自然な速さと一定の音量で読んでいます。あいうえお、かきくけこ、さしすせそ。"

[INFO] Audio has been generated... ....
🎵 Audio (speech samples)
🎧 OpenAI tts speaker: alloy (日本語)
🎧 OpenAI tts speaker: ash (日本語)
🎧 OpenAI tts speaker: ballad (日本語)
🎧 OpenAI tts speaker: cedar (日本語)
🎧 OpenAI tts speaker: coral (日本語)
🎧 OpenAI tts speaker: echo (日本語)
🎧 OpenAI tts speaker: fable (日本語)
🎧 OpenAI tts speaker: marin (日本語)
🎧 OpenAI tts speaker: nova (日本語)
🎧 OpenAI tts speaker: onyx (日本語)
🎧 OpenAI tts speaker: sage (日本語)
🎧 OpenAI tts speaker: shimmer (日本語)
🎧 OpenAI tts speaker: verse (日本語)

The cost for OpenAI's tts is extremely affordable (about $0.01 per minute), which allows us to generate a large number of audio files as instructional materials.

Sample Prompt (ChatGPT tts)
generate_speech.py "次の日本語の文章を音声に変更してください" --audioInstructions "Speak in Japanese. Accent: warm, refined, and gently instructive, reminiscent of a friendly instructor. Tone: Calm, encouraging, and articulate. Pacing: Natural speed. Make sure to pause for one or two seconds at the end of each sentence. Emotion: Cheerful, supportive, and pleasantly enthusiastic" --audioModel gpt-4o-mini-tts-2025-12-15 --audioSpeaker echo "コーヒーを飲みませんか" "映画を見ませんか" "セントラルパークで写真を撮りませんか" "図書館で勉強しませんか" "メトロポリタン美術館へ行きませんか" ... [snip]

[INFO] Audio has been generated... ....
🎵 Audio (speech samples)
🎧 コーヒーを飲みませんか
🎧 映画を見ませんか
🎧 セントラルパークで写真を撮りませんか
🎧 図書館で勉強しませんか
🎧 メトロポリタン美術館へ行きませんか
🎧 昼ごはんを食べませんか
🎧 ニューヨークのカフェでお茶をしませんか
🎧 音楽を聞きませんか
🎧 散歩しませんか
🎧 駅で会いませんか

Many more audio files for practice sentences are available in my Dropbox folder: https://t-nagano.link/FunWithAI-OpenAIAudio1

Eleven Labs specializes in AI-based speech synthesis and other audio services (such as transcription). In terms of quality, it probably surpasses speech recorded by non-professional speakers: the audio closely emulates human speech and is free of unintended interruptions such as background noise or coughing. You get fine-grained control over the synthesis and can add emotional and non-verbal cues with tags such as [surprised] and [uninterested]. There are over 1,000 voices to choose from, and you can also use your own voice by uploading a short (2-3 min) speech sample.

Eleven Labs is probably the best speech synthesis service (noticeably better than ChatGPT's TTS), but it costs considerably more: about $0.20-0.30 per minute on average, compared with about $0.01-0.02 per minute for ChatGPT's TTS.
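
To make the price gap concrete, here is a small back-of-the-envelope calculation using only the approximate per-minute rates quoted in this post. The function name estimate_cost and the workload (200 practice sentences at roughly 3 seconds each) are assumptions chosen for illustration; actual pricing depends on the plan and model.

```python
# Approximate per-minute rates quoted in this post (US dollars, low to high).
RATES_PER_MINUTE = {
    "ChatGPT tts": (0.01, 0.02),
    "Eleven Labs": (0.20, 0.30),
}


def estimate_cost(minutes, rates=RATES_PER_MINUTE):
    """Return {service: (low, high)} cost estimates for `minutes` of audio."""
    return {
        name: (round(low * minutes, 2), round(high * minutes, 2))
        for name, (low, high) in rates.items()
    }


# Assumed workload: 200 practice sentences at roughly 3 seconds each.
minutes = 200 * 3 / 60
for name, (low, high) in estimate_cost(minutes).items():
    print(f"{name}: ${low:.2f} to ${high:.2f} for {minutes:.0f} minutes")
```

Under these assumptions, ten minutes of audio comes to roughly $0.10 to $0.20 with ChatGPT's TTS and roughly $2.00 to $3.00 with Eleven Labs.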

Eleven Labs offers a free plan, which comes with some credits (enough for 10-15 minutes of speech synthesis).

Below are speech samples from Eleven Labs. Given the size of its voice library and the option to use your own voice, these are only a small fraction of the samples you can generate.

Sample Prompt (Eleven Labs)
generate_speech.py --model-id "eleven_v3" --voice-id Ellen "音声チェックです。ただいま、マイクのテストを行っています。いち、に、さん、よん、ご、ろく、なな、はち、きゅう、じゅう。生麦、生米、生卵。隣の客は、よく柿食う客だ。ただいま、自然な速さと一定の音量で読んでいます。あいうえお、かきくけこ、さしすせそ。"

[INFO] Audio has been generated... ....
🎵 Audio (speech samples)
🎧 Eleven Labs speaker: Adam
🎧 Eleven Labs speaker: Aki
🎧 Eleven Labs speaker: Akira
🎧 Eleven Labs speaker: Akito
🎧 Eleven Labs speaker: Alice
🎧 Eleven Labs speaker: Austin
🎧 Eleven Labs speaker: Ayana
🎧 Eleven Labs speaker: Bill
🎧 Eleven Labs speaker: Brian
🎧 Eleven Labs speaker: Ellen
🎧 Eleven Labs speaker: Kaori
🎧 Eleven Labs speaker: Liam

I generated the audio for the sample sentences again with ElevenLabs (compare the OpenAI TTS samples above). The audio quality is noticeably better than that of OpenAI's TTS.

Sample Prompt (Eleven Labs)
generate_speech.py --voice-id "Yui" --model-id "eleven_v3" "コーヒーを飲みませんか" "映画を見ませんか" "セントラルパークで写真を撮りませんか" "図書館で勉強しませんか" "メトロポリタン美術館へ行きませんか" ... [snip]

[INFO] Audio has been generated... ....
🎵 Audio (speech samples)
🎧 コーヒーを飲みませんか
🎧 ニューヨークのカフェでコーヒーを飲みませんか
🎧 図書館で勉強しませんか
🎧 セントラルパークを散歩しませんか
🎧 あしたいっしょに大学へ行きませんか
🎧 日本語のクラスでいっしょに勉強しませんか
🎧 ニューヨークのレストランで晩ごはんを食べませんか
🎧 駅で会いませんか

Many more audio files for practice sentences are available in my Dropbox folder: https://t-nagano.link/FunWithAI-ElevenLabsAudio1

Eleven Labs also lets you synthesize a dialogue among multiple speakers, each with a different voice. Below is an example of dialogue speech synthesis.

Sample Prompt (Eleven Labs)
generate_speech.py --model-id "eleven_v3" --dialogue-voice-map "Male1:Akira, Female1:Shizuka, Male2:Austin, Female2:Ellen"
Female1: [warmly] こんにちは。たかしさん、お元気ですか。
Male1: はい、元気です。ゆかりさん、こちらは友達のマイクさんとサラさんです。
Female2: [curious] はじめまして。サラです。アメリカ人です。学生です。
Male2: はじめまして。マイクです。アメリカ人です。よろしくお願いします。
Female1: 私はゆかりです。日本人です。大学で日本語を勉強しています。
Male1: 私はたかしです。日本人です。学生です。
Female2: [thoughtful] あのう、日本語の本を読みませんか。
Male2: [surprised] 本ですか?いいですね。
Female1: [warmly] そうですね。一緒に図書館へ行きませんか。
Male1: ええ、行きましょう。
Female2: [laugh] ありがとうございます。
Male2: [curious] 図書館はどこですか。
Female1: 大学の近くです。
Male1: [warmly] じゃあ、一緒に行きましょう。
Female2: はい、お願いします。

[INFO] Audio has been generated... ....
๐ŸŽต Audio (speech samples)
๐ŸŽง Ch01_Noun1ใฎNoun2 (audio)
๐ŸŽง Ch01_QuestionSentences (audio)
๐ŸŽง Ch01_XใฏYใงใ™ (audio)
๐ŸŽง Ch02_Nounใ˜ใ‚ƒใชใ„ใงใ™ (audio)
๐ŸŽง Ch02_Nounใ‚‚ (audio)
๐ŸŽง Ch02_Nounใ‚‚ (audio)
๐ŸŽง Ch02_ใ“ใ“ใใ“ใ‚ใใ“ใฉใ“ (audio)
๐ŸŽง Ch02_ใ“ใฎใใฎใ‚ใฎใฉใฎ๏ผ‹Noun (audio)
๐ŸŽง Ch02_ใ“ใ‚Œใ‚ใ‚Œใใ‚Œใฉใ‚Œ (audio)
๐ŸŽง Ch02_ใ ใ‚ŒใฎNoun (audio)
๐ŸŽง Ch03_๏ฝžใพใ›ใ‚“ใ‹ (audio)
๐ŸŽง Ch03_FrequencyAdverbs (audio)
๐ŸŽง Ch03_Particles (audio)
๐ŸŽง Ch03_TheTopicParticleใฏ (audio)
๐ŸŽง Ch03_TimeReferences (audio)
๐ŸŽง Ch03_VerbTypesandthePresentTense (audio)
๐ŸŽง Ch03_WordOrder (audio)

There are some audio files generated by ElevenLabs in my Dropbox folder: https://t-nagano.link/FunWithAI-ElevenLabsDialogue
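
The dialogue prompt above pairs each role label (Male1, Female1, and so on) with a voice and allows an optional tag in square brackets at the start of a line. As an illustration of that format only, here is a minimal sketch of how such a script could be split into per-turn records before being sent to a speech service. The function names parse_voice_map and parse_dialogue and the record layout are hypothetical; this is not the actual generate_speech.py, and no Eleven Labs API call is made.

```python
import re


def parse_voice_map(spec):
    """Turn 'Male1:Akira, Female1:Shizuka' into {'Male1': 'Akira', ...}."""
    pairs = (item.split(":", 1) for item in spec.split(",") if item.strip())
    return {role.strip(): voice.strip() for role, voice in pairs}


# One turn per line: "Role: [optional tag] spoken text"
LINE_PATTERN = re.compile(
    r"^(?P<role>[^:]+):\s*(?:\[(?P<tag>[^\]]+)\]\s*)?(?P<text>.+)$"
)


def parse_dialogue(script, voice_map):
    """Return a list of {'voice', 'tag', 'text'} dicts, one per dialogue turn."""
    turns = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = LINE_PATTERN.match(line)
        if match is None:
            raise ValueError(f"Cannot parse dialogue line: {line!r}")
        role = match["role"].strip()
        if role not in voice_map:
            raise KeyError(f"No voice assigned to role {role!r}")
        turns.append({
            "voice": voice_map[role],
            "tag": match["tag"],  # None when the line has no [tag]
            "text": match["text"].strip(),
        })
    return turns


voice_map = parse_voice_map(
    "Male1:Akira, Female1:Shizuka, Male2:Austin, Female2:Ellen"
)
script = """
Female1: [warmly] こんにちは。たかしさん、お元気ですか。
Male1: はい、元気です。
"""
for turn in parse_dialogue(script, voice_map):
    print(turn)
```

Raising an error for an unknown role catches typos in the role labels before any audio is generated, which matters when each synthesis request costs credits.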
